Dual-path decoder architecture for semantic segmentation of wheat ears

Published in Applied Intelligence

Abstract

In this study, a dual-path decoder segmentation network (DPDS) is presented, which introduces a dual-path structure into a semantic segmentation network incorporating atrous spatial pyramid pooling (ASPP). A novel loss function, boundary focal loss (BFLoss), is designed specifically for wheat ear segmentation: it adaptively adjusts the weights of individual pixels through the binarization of boundary information, focusing training on the edges of wheat ears. The DPDS network is applied in conjunction with BFLoss to the semantic segmentation of wheat ears. Experimental results demonstrate that BFLoss outperforms the commonly used binary cross-entropy loss (BCELoss) and focal loss in semantic segmentation. Additionally, the dual-path decoder architecture achieves higher precision than activating only one of the two pathways. In comparative experiments with established semantic segmentation networks, the DPDS model achieved the best performance on several evaluation metrics and attained a balance between precision and recall. Notably, the combination of DPDS and BFLoss achieved a 91.86% F1 score on the wheat ear semantic segmentation test set. The DPDS model can therefore be applied effectively to the semantic segmentation of crops such as wheat, and it also provides new insights for improving existing networks. Code is available at https://github.com/awesome-pythoner/dual-path-decoder-segment.
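The idea behind BFLoss as described above, a focal-style loss whose per-pixel weight is raised on a binarized boundary band, can be illustrated with a short sketch. The following PyTorch-style code is only an illustration under stated assumptions: the boundary band is approximated here by morphological dilation minus erosion of the ground-truth mask, and the function names and hyperparameters (boundary_map, boundary_focal_loss, gamma, boundary_weight, kernel_size) are hypothetical rather than taken from the authors' released implementation; see the linked repository for the actual code.

import torch
import torch.nn.functional as F

def boundary_map(mask, kernel_size=5):
    # Approximate the boundary band of the ground-truth mask as dilation minus erosion,
    # both implemented with max pooling on a {0, 1} float mask (illustrative assumption).
    pad = kernel_size // 2
    dilated = F.max_pool2d(mask, kernel_size, stride=1, padding=pad)
    eroded = -F.max_pool2d(-mask, kernel_size, stride=1, padding=pad)
    return (dilated - eroded).clamp(0, 1)

def boundary_focal_loss(logits, mask, gamma=2.0, boundary_weight=2.0):
    # Binary focal loss whose per-pixel weight is increased on the binarized boundary band.
    prob = torch.sigmoid(logits)
    p_t = prob * mask + (1 - prob) * (1 - mask)      # probability assigned to the true class
    bce = F.binary_cross_entropy_with_logits(logits, mask, reduction="none")
    focal = (1 - p_t) ** gamma * bce                 # standard focal modulation
    weights = 1.0 + (boundary_weight - 1.0) * boundary_map(mask)
    return (weights * focal).mean()

# Usage (illustrative): logits and masks are (N, 1, H, W) float tensors, masks in {0, 1}.
# loss = boundary_focal_loss(model(images), masks)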

Data availability statement

Data will be made available on reasonable request.

Acknowledgements

This work was supported by the Primary Research & Development Plan of Jiangsu Province (BE2022389).

Author information

Contributions

All authors contributed to the study’s conception and design. System construction, data collection, and analysis were performed by Yu Chen and Lihui Wang. The first draft of the manuscript was written by Yu Chen. The revision was completed with the help of Lihui Wang, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lihui Wang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, L., Chen, Y. Dual-path decoder architecture for semantic segmentation of wheat ears. Appl Intell 55, 128 (2025). https://doi.org/10.1007/s10489-024-06023-7

Keywords