CoG-Trans: coupled graph convolutional transformer for multi-label classification of cherry defects

Lin, Meiling; Li, Gongyan; Hao, Yuexing; Xu, Shaoyun

doi:10.1007/s00521-023-08521-0

Original Article
Published: 10 April 2023

Volume 35, pages 15365–15379, (2023)
Neural Computing and Applications

Meiling Lin¹,², Gongyan Li¹, Yuexing Hao¹ & Shaoyun Xu¹

Abstract

Different categories of surface defects affect cherry quality in different ways, so detecting these defects simultaneously is essential for grading. The task is difficult because it requires modeling the intrinsic dependencies between categories while accounting for category imbalance. We treat cherry defect recognition as a multi-label classification task and present a novel identification network called Coupled Graph convolutional Transformer (CoG-Trans). Through our proposed categorical representation extraction module, we model the relevance between categories implicitly with the self-attention mechanism and explicitly with static co-occurrence patterns. Moreover, we design a VI-Fusion module based on the attention mechanism to fuse the visible and infrared information sources. Additionally, we employ an asymmetric-contrastive loss to correct the category imbalance and learn more discriminative features for each label. Our experiments are conducted on the VI-Cherry dataset, which consists of 9492 paired visible and infrared cherry images manually annotated with six defect categories and one normal category. The proposed method performs strongly compared with previous work, achieving 99.54% mAP on the VI-Cherry dataset.
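To make two of the terms in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows one common way to estimate static co-occurrence patterns from multi-hot label annotations (the conditional frequency of label j given label i) and one common definition of mAP (the mean over classes of ranking-based average precision). The function names, the three-class toy labels, and the toy scores are all hypothetical, and the abstract does not state which co-occurrence statistic or mAP protocol the paper uses.

```python
from typing import List, Sequence


def cooccurrence_matrix(labels: Sequence[Sequence[int]]) -> List[List[float]]:
    """Estimate P(label j present | label i present) from multi-hot rows."""
    num_classes = len(labels[0])
    counts = [[0] * num_classes for _ in range(num_classes)]
    for row in labels:
        active = [k for k, v in enumerate(row) if v]
        for i in active:
            for j in active:
                counts[i][j] += 1
    probs = []
    for i in range(num_classes):
        n_i = counts[i][i]  # number of samples in which label i is present
        probs.append(
            [counts[i][j] / n_i if n_i else 0.0 for j in range(num_classes)]
        )
    return probs


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """AP for one class: mean precision at the rank of each positive sample."""
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    hits = 0
    precisions = []
    for rank, k in enumerate(order, start=1):
        if targets[k]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(
    score_rows: Sequence[Sequence[float]], label_rows: Sequence[Sequence[int]]
) -> float:
    """mAP: average of the per-class AP values."""
    num_classes = len(label_rows[0])
    aps = [
        average_precision(
            [s[c] for s in score_rows], [t[c] for t in label_rows]
        )
        for c in range(num_classes)
    ]
    return sum(aps) / num_classes


# Hypothetical toy data: 4 samples, 3 labels (not taken from VI-Cherry).
toy_labels = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]]
toy_scores = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.3],
    [0.2, 0.6, 0.9],
    [0.8, 0.1, 0.2],
]

toy_cooccurrence = cooccurrence_matrix(toy_labels)
toy_map = mean_average_precision(toy_scores, toy_labels)
```

In graph-based multi-label models, a matrix like `toy_cooccurrence` is typically thresholded or re-weighted before it is used as an adjacency matrix. Whether CoG-Trans does so is not stated in the abstract.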

Data Availability

Data will be made available on reasonable request.

Acknowledgements

This work was supported by the Chinese Academy of Sciences Engineering Laboratory for Intelligent Logistics Equipment System (No. KFJ-PTXM-025).

Author information

Authors and Affiliations

Institute of Microelectronics, Chinese Academy of Sciences, Beijing, 100029, China
Meiling Lin, Gongyan Li, Yuexing Hao & Shaoyun Xu
University of Chinese Academy of Sciences, Beijing, 100049, China
Meiling Lin

Corresponding author

Correspondence to Yuexing Hao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, M., Li, G., Hao, Y. et al. CoG-Trans: coupled graph convolutional transformer for multi-label classification of cherry defects. Neural Comput & Applic 35, 15365–15379 (2023). https://doi.org/10.1007/s00521-023-08521-0

Received: 08 August 2022
Accepted: 21 March 2023
Published: 10 April 2023
Issue Date: July 2023
DOI: https://doi.org/10.1007/s00521-023-08521-0
