Abstract
With the escalating demand for precise coal gangue detection in industrial applications, efficient and robust object detection architectures are imperative. In this study, we introduce L-DEYO, a lightweight deep learning model designed for intelligent coal gangue recognition in resource-constrained environments. L-DEYO integrates a hierarchical feature extraction mechanism with a progressive training paradigm, optimizing both detection accuracy and inference speed without requiring additional annotated data. Comprehensive evaluations show that L-DEYO attains a mean average precision (mAP) of 37.6% on the COCO benchmark while running in real time at 497 frames per second (FPS) on an NVIDIA Tesla T4 GPU. Notably, the model's modular design enables efficient training on a single 8 GB RTX 4060 GPU, substantially reducing computational overhead. These findings underscore L-DEYO's efficacy and scalability, offering a viable solution for large-scale deployment in industrial coal gangue detection systems. The code is available at https://github.com/srcuyan/L-DEYO.git.
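As context for the reported 497 FPS figure, the sketch below shows one common way to time batch-1 inference throughput in PyTorch. It is a minimal illustration only, not the authors' benchmarking code: the measure_fps helper and the placeholder network are assumptions, and the actual L-DEYO model should be loaded as described in the repository linked above.

# Hypothetical sketch (not from the paper): timing batch-1 inference
# throughput in PyTorch. The placeholder network stands in for L-DEYO,
# whose loading API is defined in the linked repository.
import time
import torch

def measure_fps(model, img_size=640, warmup=50, iters=200, device="cuda"):
    """Return frames per second for batch-1 inference at img_size x img_size."""
    model = model.eval().to(device)
    dummy = torch.randn(1, 3, img_size, img_size, device=device)
    with torch.inference_mode():
        for _ in range(warmup):            # warm up kernels / cuDNN autotuning
            model(dummy)
        if device == "cuda":
            torch.cuda.synchronize()       # start timing only once the GPU is idle
        start = time.perf_counter()
        for _ in range(iters):
            model(dummy)
        if device == "cuda":
            torch.cuda.synchronize()       # wait for queued GPU work to finish
    return iters / (time.perf_counter() - start)

if __name__ == "__main__":
    import torchvision
    device = "cuda" if torch.cuda.is_available() else "cpu"
    net = torchvision.models.resnet18(weights=None)  # placeholder; substitute the L-DEYO model
    print(f"{measure_fps(net, device=device):.1f} FPS")

Note that any such measurement depends on hardware, input resolution, numerical precision (e.g., FP16 or TensorRT deployment), and whether pre- and post-processing are included, so figures obtained this way are only roughly comparable to the paper's reported throughput.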
Data availability
No datasets were generated or analysed during the current study.
Author information
Contributions
S. Yan wrote the main manuscript text and prepared most of the figures. W. Liu wrote the introduction, Z. Yang contributed to some of the figures, and E. Zhang revised the article formatting. Y. Chang reviewed the manuscript and suggested revisions. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, S., Liu, W., Yang, Z. et al. L-DEYO: An optimized lightweight model for intelligent coal gangue recognition. J Real-Time Image Proc 22, 91 (2025). https://doi.org/10.1007/s11554-025-01667-1
DOI: https://doi.org/10.1007/s11554-025-01667-1