Abstract
Current X-ray prohibited item detection systems still lack accuracy against complex backgrounds and struggle to run in real time. To address poor image contrast, difficulty localizing obscured objects, and low detection efficiency, this paper proposes an X-ray prohibited item detection method based on inverted residual layer aggregation (IRLAN) and lightweight contextual downsampling (LCDown). First, FasterNet is adopted as the backbone network to optimize feature representation and improve detection speed. Second, an inverted residual layer aggregation strategy and a lightweight contextual downsampling strategy are proposed to strengthen the model's feature fusion and to counter the unclear edges and details caused by the low contrast and noise that usually exist in X-ray images. Finally, the efficient intersection over union (EIoU) loss function is introduced to improve the localization of prohibited items and accelerate model convergence. Compared with RT-DETR-R50, which has the highest accuracy among current object detection algorithms, the proposed method improves mAP50 on the SIXray and OPIXray datasets by 1.4% and 1.3%, respectively, while running at 42.9 frames per second, which meets the requirement for real-time detection. Experimental results, together with Grad-CAM feature heatmaps, show that the proposed method achieves good detection results on public X-ray image datasets.
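The efficient intersection over union loss named in the abstract can be illustrated with a minimal sketch. It assumes the commonly published EIoU formulation (one minus IoU, plus a centre-distance penalty and separate width and height penalties, each normalized by the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are illustrative assumptions and are not taken from the paper's implementation.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the authors' code.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Widths and heights of both boxes.
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both inputs.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared centre distance, normalized by the enclosing box diagonal.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Separate width and height penalties, normalized by the enclosing box.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_term + width_term + height_term
```

Under this formulation, a perfect prediction yields a loss close to zero, while non-overlapping boxes are still penalized by centre distance and size mismatch, so the loss keeps giving a useful signal when IoU alone is zero.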
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig4_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig5_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig6_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig7_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11760-024-03763-4/MediaObjects/11760_2024_3763_Fig8_HTML.png)
Data availability
No datasets were generated or analysed during the current study.
Acknowledgements
The authors gratefully acknowledge the helpful comments and suggestions of the reviewers and the editors, which helped to improve the quality of this paper.
Author information
Contributions
Yixuan Wang: Conceptualization, Methodology, Software. Songhao Zhu: Supervision. Zhiwei Liang: Writing-Reviewing and Editing. All authors reviewed the manuscript.
Ethics declarations
Ethical approval
This declaration is not applicable.
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Wang, Y., Zhu, S. & Liang, Z. X-ray prohibited item detection via inverted residual layer aggregation and lightweight contextual downsampling. SIViP 19, 149 (2025). https://doi.org/10.1007/s11760-024-03763-4
DOI: https://doi.org/10.1007/s11760-024-03763-4