A Three-Stage Model Fusion Method for Out-of-Distribution Generalization

Wang, Jiahao; Wang, Hao; Dong, Zhuojun; Yang, Hua; Yang, Yuting; Bao, Qianyue; Liu, Fang; Jiao, LiCheng

doi:10.1007/978-3-031-25075-0_33


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13806)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Training a model from scratch in a data-deficient environment is a challenging task. For this challenge, we train multiple distinct backbones and use a number of tricks to assist training, such as weight initialization, mixup, and CutMix. Finally, we propose a three-stage model fusion method to improve accuracy. Our final Top-1 accuracy on the public test set is 84.62421%.
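
For readers unfamiliar with the two augmentation tricks named in the abstract, the following is a minimal, dependency-free sketch of mixup and CutMix applied to a single pair of images. It is not the authors' implementation: the function names, the nested-list image representation, and the explicit `lam` argument are assumptions made for illustration. In practice `lam` is usually drawn from a Beta distribution (for example with `random.betavariate`) and the operations run on batched tensors.

```python
import math
import random


def mixup(x1, y1, x2, y2, lam):
    """Blend two images and their one-hot labels with weight lam in [0, 1]."""
    x = [[lam * a + (1.0 - lam) * b for a, b in zip(r1, r2)]
         for r1, r2 in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def cutmix(x1, y1, x2, y2, lam, rng):
    """Paste a random box from x2 into x1 and mix labels by pasted area."""
    h, w = len(x1), len(x1[0])
    # Side lengths scale with sqrt(1 - lam), so the box covers
    # roughly a (1 - lam) fraction of the image before clipping.
    cut_h = int(h * math.sqrt(1.0 - lam))
    cut_w = int(w * math.sqrt(1.0 - lam))
    cy, cx = rng.randrange(h), rng.randrange(w)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    x = [row[:] for row in x1]  # copy so the input image is not modified
    for i in range(top, bottom):
        x[i][left:right] = x2[i][left:right]

    # Recompute the label weight from the area actually pasted after clipping.
    lam_adj = 1.0 - ((bottom - top) * (right - left)) / (h * w)
    y = [lam_adj * a + (1.0 - lam_adj) * b for a, b in zip(y1, y2)]
    return x, y
```

Both functions return soft labels, so a training loop that uses them would need a loss that accepts label distributions rather than class indices.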

Acknowledgement

Throughout the writing of this dissertation I have received a great deal of support and assistance. I would first like to thank my tutors for their valuable guidance throughout my studies. You provided me with the tools that I needed to choose the right direction and successfully complete my competition. I would particularly like to acknowledge my teammates for their wonderful collaboration and patient support. Finally, I would not have been able to get in touch with this competition without the support of the organizer, NICO, who provided a good competition environment and reasonable competition opinions.

This work was supported by the National Natural Science Foundation of China (No. 62076192), the Key Research and Development Program in Shaanxi Province of China (No. 2019ZDLGY03-06), the State Key Program of National Natural Science of China (No. 61836009), the Program for Cheung Kong Scholars and Innovative Research Team in University (No. IRT_15R53), the Fund for Foreign Scholars in University Research and Teaching Programs (the 111 Project) (No. B07048), the Key Scientific Technological Innovation Research Project by Ministry of Education, the National Key Research and Development Program of China, and the CAAI Huawei MindSpore Open Fund.

Author information

Authors and Affiliations

Key Laboratory of Intelligent Perception and Image Understanding, Xi'an, China
Jiahao Wang, Zhuojun Dong, Hua Yang, Yuting Yang, Qianyue Bao, Fang Liu & LiCheng Jiao
School of Artificial Intelligence, Xidian University, Xi'an, Shaanxi Province, China
Hao Wang

Corresponding author

Correspondence to Jiahao Wang.

Editor information

Editors and Affiliations

IBM Research - MIT-IBM Watson AI Lab, Massachusetts, USA
Leonid Karlinsky
Technion – Israel Institute of Technology, Haifa, Israel
Tomer Michaeli
Kyoto University, Kyoto, Japan
Ko Nishino

About this paper

Cite this paper

Wang, J. et al. (2023). A Three-Stage Model Fusion Method for Out-of-Distribution Generalization. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13806. Springer, Cham. https://doi.org/10.1007/978-3-031-25075-0_33

DOI: https://doi.org/10.1007/978-3-031-25075-0_33
Published: 19 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25074-3
Online ISBN: 978-3-031-25075-0
eBook Packages: Computer Science, Computer Science (R0)
