
Neural Substitution for Branch-Level Network Re-parameterization

  • Conference paper
Computer Vision – ACCV 2024 (ACCV 2024)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15479)


Abstract

We propose the neural substitution method for network re-parameterization at branch-level connectivity. The method learns diverse network topologies to maximize the ensemble effect, since re-parameterization allows multiple individually trained layers to be integrated into a single layer at inference. We further introduce a guiding method that folds non-linear activation functions into a linear transformation during re-parameterization. Because branch-level connectivity requires multiple non-linear activation functions, our guided activation method merges them into a single activation during re-parameterization. Incorporating the non-linear activation function is significant because it overcomes a limitation of existing re-parameterization methods, which operate only at block-level connectivity. Restricting re-parameterization to block-level connectivity constrains the usable network topologies, making it difficult to learn diverse feature representations. In contrast, the proposed approach, with its unrestricted branch-level topology, learns a considerably richer representation than existing methods and provides a generalized framework that can be combined with other methods. We provide comprehensive experimental evidence for the proposed re-parameterization approach. Our code is available at https://github.com/SoongE/neural_substitution.
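As a point of orientation (not the paper's neural substitution method itself), the linearity that all structural re-parameterization relies on can be sketched in a few lines: two parallel linear branches trained separately can be collapsed into one fused layer whose output is identical at inference. The shapes and variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two parallel linear branches (e.g. 1x1-conv-like layers), trained separately.
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 8)), rng.normal(size=4)

x = rng.normal(size=8)

# Multi-branch forward pass used during training.
y_train = (W1 @ x + b1) + (W2 @ x + b2)

# Re-parameterized single layer used at inference:
# (W1 + W2) x + (b1 + b2) is exactly equivalent by linearity.
W_fused, b_fused = W1 + W2, b1 + b2
y_infer = W_fused @ x + b_fused

assert np.allclose(y_train, y_infer)
```

This equivalence holds only while the merged branches are linear, which is why folding non-linear activations into the transformation, as the guided activation method proposed here does, is the harder part of branch-level re-parameterization.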



Acknowledgements

This paper was supported in part by the Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean Government (Fundamental Technology Research for Human-Centric Autonomous Intelligent Systems) under Grant 24ZB1200; by the Institute of Information and Communications Technology Planning and Evaluation (IITP) grant funded by the Korea Government (MSIT) (Artificial Intelligence Innovation Hub) under Grant RS-2021-II212068 and under the Artificial Intelligence Convergence Innovation Human Resources Development program (IITP-2024-RS-2023-00255968); and by the National Research Foundation of Korea (NRF) grant funded by the Korea Government (MSIT) under Grant RS-2024-00356486.

Author information


Corresponding author

Correspondence to Jongbin Ryu.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 197 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Oh, S., Ryu, J. (2025). Neural Substitution for Branch-Level Network Re-parameterization. In: Cho, M., Laptev, I., Tran, D., Yao, A., Zha, H. (eds) Computer Vision – ACCV 2024. ACCV 2024. Lecture Notes in Computer Science, vol 15479. Springer, Singapore. https://doi.org/10.1007/978-981-96-0966-6_7

  • DOI: https://doi.org/10.1007/978-981-96-0966-6_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-96-0965-9

  • Online ISBN: 978-981-96-0966-6

  • eBook Packages: Computer Science, Computer Science (R0)
