Dual-path decoder architecture for semantic segmentation of wheat ears

Wang, Lihui; Chen, Yu

doi:10.1007/s10489-024-06023-7

Published: 11 December 2024

Applied Intelligence, Volume 55, article number 128 (2025)

Abstract

This study presents a dual-path decoder segmentation network (DPDS), which introduces a dual-path decoder structure into a semantic segmentation network that incorporates atrous spatial pyramid pooling (ASPP). A novel loss function, boundary focal loss (BFLoss), is designed specifically for wheat ear segmentation: it adaptively adjusts the weight of each pixel using binarized boundary information, focusing training on the edges of wheat ears. The DPDS network is proposed for use together with BFLoss in the semantic segmentation of wheat ears. Experimental results show that BFLoss has advantages over the commonly used binary cross-entropy loss (BCELoss) and focal loss in semantic segmentation. In addition, the dual-path decoder architecture was shown to reach higher precision than either pathway activated alone. In comparative experiments with established semantic segmentation networks, the DPDS model achieved the best performance on several evaluation metrics and attained a balance between precision and recall. Notably, the combination of DPDS and BFLoss achieved an F1 score of 91.86% on the wheat ear semantic segmentation test dataset. The DPDS model can therefore be applied effectively to semantic segmentation of crops such as wheat, and it also offers new insights for improving existing networks. Code is available at https://github.com/awesome-pythoner/dual-path-decoder-segment.
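
The abstract describes BFLoss only at a high level: a binarized boundary map is used to reweight pixels so that training concentrates on wheat ear edges. The sketch below illustrates that general idea in NumPy and is not the authors' formulation. It assumes a 4-neighbour label-change test to binarize the boundary, a fixed multiplier for boundary pixels, and the standard focal term. The names `boundary_map`, `boundary_weighted_focal_loss`, `boundary_weight`, and `gamma` are hypothetical and chosen for illustration; the linked repository holds the actual implementation.

```python
import numpy as np


def boundary_map(mask: np.ndarray) -> np.ndarray:
    """Return a binary map that is 1 where a pixel's label differs
    from at least one of its 4-connected neighbours (assumed rule)."""
    m = mask.astype(bool)
    edge = np.zeros_like(m)
    # Label changes between vertically adjacent pixels mark both pixels.
    diff_v = m[1:, :] != m[:-1, :]
    edge[1:, :] |= diff_v
    edge[:-1, :] |= diff_v
    # Label changes between horizontally adjacent pixels mark both pixels.
    diff_h = m[:, 1:] != m[:, :-1]
    edge[:, 1:] |= diff_h
    edge[:, :-1] |= diff_h
    return edge.astype(np.float64)


def boundary_weighted_focal_loss(
    prob: np.ndarray,
    mask: np.ndarray,
    gamma: float = 2.0,            # focal exponent (assumed default)
    boundary_weight: float = 4.0,  # multiplier on boundary pixels (assumed)
    eps: float = 1e-7,
) -> float:
    """Focal loss with per-pixel weights raised on boundary pixels.

    prob: predicted foreground probabilities in [0, 1], shape (H, W).
    mask: ground-truth binary mask, shape (H, W).
    """
    prob = np.clip(prob, eps, 1.0 - eps)
    target = mask.astype(np.float64)
    # Probability assigned to the true class of each pixel.
    p_t = np.where(target == 1.0, prob, 1.0 - prob)
    focal = -((1.0 - p_t) ** gamma) * np.log(p_t)
    # Weight 1 away from the boundary, boundary_weight on it.
    weights = 1.0 + (boundary_weight - 1.0) * boundary_map(mask)
    return float((weights * focal).sum() / weights.sum())


# Toy example: a 2x2 "ear" in a 4x4 image with a confident prediction.
toy_mask = np.zeros((4, 4), dtype=np.uint8)
toy_mask[1:3, 1:3] = 1
toy_prob = np.where(toy_mask == 1, 0.9, 0.1)
print(boundary_weighted_focal_loss(toy_prob, toy_mask))
```

With `boundary_weight=1.0` the function reduces to a plain mean focal loss, so the multiplier isolates the effect of emphasizing edges. In this sketch, the same mistake costs more on a boundary pixel than on an interior or background pixel.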

Data availability statement

Data will be made available on reasonable request.

Acknowledgements

This work was supported by the Primary Research & Development Plan of Jiangsu Province (BE2022389).

Author information

Authors and Affiliations

Key Laboratory of Micro-inertial Instrument and Advanced Navigation Technology, Ministry of Education, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, Jiangsu, People’s Republic of China
Lihui Wang & Yu Chen

Contributions

All authors contributed to the study’s conception and design. System construction, data collection, and analysis were performed by Yu Chen and Lihui Wang. The first draft of the manuscript was written by Yu Chen. The revision was completed with the help of Lihui Wang, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lihui Wang.

Ethics declarations

Conflict of interest

None.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, L., Chen, Y. Dual-path decoder architecture for semantic segmentation of wheat ears. Appl Intell 55, 128 (2025). https://doi.org/10.1007/s10489-024-06023-7

Accepted: 18 September 2024
Published: 11 December 2024
DOI: https://doi.org/10.1007/s10489-024-06023-7
