
Multiscale Kiwifruit Detection from Digital Images

  • Conference paper
  • In: Image and Video Technology (PSIVT 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14403)


Abstract

In this paper, we propose an improved YOLOv8-based Kiwifruit detection method using Swin Transformer, aiming to address the challenges posed by significant scale variation and inaccuracies in multiscale object detection. Specifically, our approach embeds the Swin Transformer encoder, with its shifted-window design, into the YOLOv8 architecture to capture contextual information and global dependencies of the detected objects at multiple scales, facilitating the learning of semantic features. In comparative experiments against state-of-the-art object detection algorithms on our collected dataset, the proposed method detects objects at different scales efficiently, significantly reducing false negatives while improving precision. Moreover, the method proves versatile in detecting objects of various sizes across different environmental settings, and it fulfils the real-time requirements of complex and unknown Kiwifruit cultivation scenarios. These results highlight the practical potential of the proposed approach in the Kiwifruit industry and its suitability for real-world challenges and complexities.
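The shifted-window attention idea the abstract refers to can be illustrated with a minimal NumPy sketch: the feature map is cyclically shifted, partitioned into non-overlapping windows, self-attention is computed inside each window, and the shift is undone. This is not the authors' implementation — learned q/k/v projections, multi-head splitting, relative position bias, and the attention mask that real Swin Transformer applies to shifted windows are all omitted, and the function names are illustrative.

```python
import numpy as np

def window_partition(x, ws):
    # x: (H, W, C) feature map -> (num_windows, ws*ws, C)
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def window_reverse(wins, ws, H, W):
    # Inverse of window_partition: (num_windows, ws*ws, C) -> (H, W, C)
    C = wins.shape[-1]
    x = wins.reshape(H // ws, W // ws, ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def shifted_window_attention(x, ws=4, shift=2):
    # Cyclically shift the map so tokens near window borders can attend
    # across the previous partition, then attend within each window.
    H, W, C = x.shape
    shifted = np.roll(x, (-shift, -shift), axis=(0, 1))
    wins = window_partition(shifted, ws)            # (nW, ws*ws, C)
    # Identity q/k/v projections keep the sketch dependency-free;
    # the real block uses learned linear layers here.
    q, k, v = wins, wins, wins
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(C))
    out = attn @ v
    out = window_reverse(out, ws, H, W)
    return np.roll(out, (shift, shift), axis=(0, 1))  # undo the shift

# A multiscale neck would apply such a block at each pyramid level:
feat = np.random.rand(8, 8, 16).astype(np.float32)
out = shifted_window_attention(feat, ws=4, shift=2)
print(out.shape)  # (8, 8, 16)
```

Alternating a zero-shift window block with a shifted one is what lets window-local attention propagate context across the whole map at a cost linear in image size, which is why it suits multiscale detection backbones.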



Author information

Corresponding author

Correspondence to Yi Xia.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xia, Y., Nguyen, M., Lutui, R., Yan, W.Q. (2024). Multiscale Kiwifruit Detection from Digital Images. In: Yan, W.Q., Nguyen, M., Nand, P., Li, X. (eds) Image and Video Technology. PSIVT 2023. Lecture Notes in Computer Science, vol 14403. Springer, Singapore. https://doi.org/10.1007/978-981-97-0376-0_7

  • DOI: https://doi.org/10.1007/978-981-97-0376-0_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0375-3

  • Online ISBN: 978-981-97-0376-0

  • eBook Packages: Computer Science, Computer Science (R0)
