
Multiscale Kiwifruit Detection from Digital Images

  • Conference paper
  • In: Image and Video Technology (PSIVT 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14403)


Abstract

In this paper, we propose an improved YOLOv8-based Kiwifruit detection method using Swin Transformer, aiming to address the challenges posed by significant scale variation and inaccuracies in multiscale object detection. Specifically, our approach embeds the Swin Transformer encoder, with its shifted-window design, into the YOLOv8 architecture to capture contextual information and global dependencies of the detected objects at multiple scales, facilitating the learning of semantic features. In comparative experiments against state-of-the-art object detection algorithms on our collected dataset, the proposed method detects objects at different scales efficiently, significantly reducing false negatives while improving precision. Moreover, the method proves versatile in detecting objects of various sizes across different environmental settings, and it fulfils the real-time requirements of complex and unknown Kiwifruit cultivation scenarios. These results highlight the practical potential of the proposed approach in the Kiwifruit industry and its suitability for real-world challenges and complexities.
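The shifted-window attention idea the abstract refers to can be illustrated with a minimal NumPy sketch: the feature map is cyclically shifted, partitioned into non-overlapping windows, self-attention is computed inside each window, and the shift is undone. This is not the authors' implementation — learned q/k/v projections, multi-head splitting, relative position bias, and the attention mask that real Swin Transformer applies to shifted windows are all omitted, and the function names are illustrative.

```python
import numpy as np

def window_partition(x, ws):
    # x: (H, W, C) feature map -> (num_windows, ws*ws, C)
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def window_reverse(wins, ws, H, W):
    # Inverse of window_partition: (num_windows, ws*ws, C) -> (H, W, C)
    C = wins.shape[-1]
    x = wins.reshape(H // ws, W // ws, ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(H, W, C)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def shifted_window_attention(x, ws=4, shift=2):
    # Cyclically shift the map so tokens near window borders can attend
    # across the previous partition, then attend within each window.
    H, W, C = x.shape
    shifted = np.roll(x, (-shift, -shift), axis=(0, 1))
    wins = window_partition(shifted, ws)            # (nW, ws*ws, C)
    # Identity q/k/v projections keep the sketch dependency-free;
    # the real block uses learned linear layers here.
    q, k, v = wins, wins, wins
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(C))
    out = attn @ v
    out = window_reverse(out, ws, H, W)
    return np.roll(out, (shift, shift), axis=(0, 1))  # undo the shift

# A multiscale neck would apply such a block at each pyramid level:
feat = np.random.rand(8, 8, 16).astype(np.float32)
out = shifted_window_attention(feat, ws=4, shift=2)
print(out.shape)  # (8, 8, 16)
```

Alternating a zero-shift window block with a shifted one is what lets window-local attention propagate context across the whole map at a cost linear in image size, which is why it suits multiscale detection backbones.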



Author information

Corresponding author

Correspondence to Yi Xia.



Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Xia, Y., Nguyen, M., Lutui, R., Yan, W.Q. (2024). Multiscale Kiwifruit Detection from Digital Images. In: Yan, W.Q., Nguyen, M., Nand, P., Li, X. (eds) Image and Video Technology. PSIVT 2023. Lecture Notes in Computer Science, vol 14403. Springer, Singapore. https://doi.org/10.1007/978-981-97-0376-0_7

  • DOI: https://doi.org/10.1007/978-981-97-0376-0_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-0375-3

  • Online ISBN: 978-981-97-0376-0

  • eBook Packages: Computer Science, Computer Science (R0)
