Abstract
Anthracnose is a common crop disease, and rapid diagnosis using computer vision can reduce economic losses. Convolutional neural networks (CNNs) remain the mainstream approach to crop disease detection and lesion segmentation. However, CNN-based methods struggle with detection and segmentation tasks in which the differences between crop lesions are subtle. Motivated by the characteristics of anthracnose lesions, which are generally large, vary considerably in size, and have regular shapes, the Path Aggregation Swin Transformer Network (PAST-Net) is proposed to perform lesion segmentation and species detection of anthracnose simultaneously. First, a Swin Transformer backbone extracts features from the input images. Second, the extracted lesion features pass sequentially through a top-down feature pyramid network and a bottom-up augmentation path, which preserve shallow-layer features and improve the extraction of large lesions. Next, the proposal features from all pyramid levels are fused by adaptive feature pooling. Finally, the box branch performs classification and bounding-box regression, while the mask branch performs lesion segmentation. Experimental results show that PAST-Net improves both object detection and instance segmentation on the collected anthracnose dataset, achieving a recognition accuracy of 73.70% and a segmentation accuracy of 75.35%, which exceed the baseline by 5.86% and 3.57%, respectively.
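The feature-aggregation stages described above (top-down feature pyramid, bottom-up augmentation path, and adaptive feature pooling, following the Path Aggregation Network design) can be sketched as follows. This is a minimal NumPy illustration of the data flow, not the authors' implementation: the function names, the nearest-neighbour resampling stand-ins for the learned lateral and downsampling convolutions, and the elementwise-max fusion are all simplifying assumptions for clarity.

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbour 2x upsampling over the spatial axes of a (C, H, W) map,
    # standing in for the learned upsampling used in a real FPN.
    return x.repeat(2, axis=1).repeat(2, axis=2)

def downsample2x(x):
    # Stride-2 subsampling, standing in for a stride-2 convolution.
    return x[:, ::2, ::2]

def path_aggregation(feats):
    """feats: backbone feature maps [C2, ..., C5], finest resolution first,
    all with the same channel count. Returns the bottom-up augmented
    pyramid [N2, ..., N5]."""
    # Top-down pathway: P_i = C_i + upsample(P_{i+1})
    p = [None] * len(feats)
    p[-1] = feats[-1]
    for i in range(len(feats) - 2, -1, -1):
        p[i] = feats[i] + upsample2x(p[i + 1])
    # Bottom-up augmentation path: N_{i+1} = P_{i+1} + downsample(N_i),
    # which shortens the route from shallow features to the top levels.
    n = [p[0]]
    for i in range(1, len(p)):
        n.append(p[i] + downsample2x(n[-1]))
    return n

def adaptive_pool(roi_feats):
    # Adaptive feature pooling: fuse the same proposal's pooled features
    # from every pyramid level, here by elementwise max.
    return np.maximum.reduce(roi_feats)

# Toy pyramid: 4 levels, 8 channels, resolutions 64 down to 8.
feats = [np.random.rand(8, 64 // 2**i, 64 // 2**i) for i in range(4)]
pyramid = path_aggregation(feats)
print([f.shape for f in pyramid])  # resolutions preserved per level
```

In PAST-Net the fused per-proposal features would then feed the box branch (classification and bounding-box regression) and the mask branch (lesion segmentation).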
Data availability
Data are available from the authors upon request.
Acknowledgements
This work was supported in part by NSFC (U1931207 and 61702306), Sci. & Tech. Development Fund of Shandong Province of China (ZR2022MF288, ZR2017MF027 and ZR2022MF319), and the Taishan Scholar Program of Shandong Province.
Author information
Authors and Affiliations
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, Y., Wang, S., Ni, W. et al. PAST-net: a swin transformer and path aggregation model for anthracnose instance segmentation. Multimedia Systems 29, 1011–1023 (2023). https://doi.org/10.1007/s00530-022-01033-2