
Self-distillation object segmentation via pyramid knowledge representation and transfer

  • Regular Paper
  • Published in Multimedia Systems

Abstract

Self-distillation methods transfer knowledge within a network itself to enhance the network's generalization ability. However, because they lack spatially refined knowledge representations, current self-distillation methods can hardly be applied directly to object segmentation tasks. In this paper, we propose a novel self-distillation framework via pyramid knowledge representation and transfer for the object segmentation task. First, a lightweight inference network is built to perform pixel-wise prediction rapidly. Second, a novel self-distillation method is proposed. To derive refined pixel-wise knowledge representations, an auxiliary self-distillation network with multi-level pyramid representation branches is built and appended to the inference network. A synergy distillation loss, which utilizes top-down and consistency knowledge transfer paths, is presented to force more discriminative knowledge to be distilled into the inference network. Consequently, the performance of the inference network is improved. Experimental results on five object segmentation datasets demonstrate that the proposed self-distillation method enables our inference network to achieve better segmentation effectiveness and efficiency than nine recent object segmentation networks. Furthermore, the proposed self-distillation method outperforms typical self-distillation methods. The source code is publicly available at https://github.com/xfflyer/SKDforSegmentation.
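To make the structure of such a loss concrete, the following is a minimal NumPy sketch of a synergy-style distillation objective for binary segmentation. It is an illustrative assumption, not the paper's exact formulation: the function name `synergy_distillation_loss`, the choice of BCE for supervision, MSE for the transfer terms, and the weights `alpha` and `beta` are all hypothetical. It combines per-head supervision, a top-down path in which each deeper pyramid branch teaches the next shallower one, and a consistency path in which every branch teaches the lightweight inference head.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(prob, target, eps=1e-7):
    # Pixel-wise binary cross-entropy, clipped for numerical stability.
    p = np.clip(prob, eps, 1 - eps)
    return -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))

def synergy_distillation_loss(infer_logits, branch_logits, target,
                              alpha=0.5, beta=0.5):
    """Toy synergy distillation loss (illustrative, not the paper's exact form).

    infer_logits:  (H, W) logits from the lightweight inference head.
    branch_logits: list of (H, W) logits from pyramid branches, deepest first.
    target:        (H, W) binary ground-truth mask.
    """
    probs = [sigmoid(l) for l in branch_logits]
    p_inf = sigmoid(infer_logits)
    # Supervised loss on every head (inference head and all pyramid branches).
    sup = bce(p_inf, target) + sum(bce(p, target) for p in probs)
    # Top-down transfer path: each deeper branch teaches the next shallower one.
    top_down = sum(np.mean((probs[i + 1] - probs[i]) ** 2)
                   for i in range(len(probs) - 1))
    # Consistency transfer path: every branch teaches the inference head.
    consistency = sum(np.mean((p_inf - p) ** 2) for p in probs)
    return sup + alpha * top_down + beta * consistency
```

In a real training loop the same three-term structure would be written with differentiable tensor operations so that gradients flow into both the auxiliary branches and the inference network.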


Data availability statement

All data generated or analysed during this study are included in this published article. The datasets used in this paper are public.


Funding

This research was supported by the Natural Science Foundation of China (Nos. 61801512, 62071484) and Natural Science Foundation of Jiangsu Province (No. BK20180080).

Author information


Contributions

YZ contributed to model design, implementation, and paper writing. MS contributed to model design and paper writing. XW contributed to data analysis and paper writing. TC contributed to model design and data analysis. XZ contributed to model design and data analysis. LX contributed to data analysis. ZF contributed to model implementation and paper writing. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Tieyong Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Ethical approval

Not applicable.

Consent for publication

Not applicable.

Consent to participate

Not applicable.

Code availability

The source code is publicly available at https://github.com/xfflyer/SKDforSegmentation.

Additional information

Communicated by Y. Kong.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zheng, Y., Sun, M., Wang, X. et al. Self-distillation object segmentation via pyramid knowledge representation and transfer. Multimedia Systems 29, 2615–2631 (2023). https://doi.org/10.1007/s00530-023-01121-x
