Multi-scale feature correspondence and restriction mechanism for visible X-ray baggage re-Identification

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Recently, public security surveillance has posed a new challenge for AI: Visible-X-ray baggage Re-Identification (VX-ReID), which aims to re-identify and retrieve baggage across the visible and X-ray imaging modalities. Compared with cross-modality person re-identification, VX-ReID faces two distinctive bottlenecks: shape deformation and feature entanglement. For the former, the shape of a bag can change drastically between modalities, severely undermining feature robustness. For the latter, X-ray images reveal the contents of the baggage, which are invisible in daylight images. Both issues greatly degrade the performance of representation-learning loss functions (such as ID loss) in the Re-ID task. In this paper, we propose a Cross-Modality Multi-scale Feature Correspondence model (CMMFC) for VX-ReID. Specifically, we devise and compute multiple feature correspondences between modalities on multi-scale feature maps to overcome the deformation problem. We also introduce a novel Feature Restriction Mechanism (FRM) to alleviate the feature entanglement problem: it imposes different constraints on features at different scales and accurately drives the network toward discriminative, modality-irrelevant features. Finally, CMMFC is extensively evaluated on our RX01 dataset, where experiments show that it achieves state-of-the-art performance.
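The full method is available only in the subscription article, but the core idea described above can be sketched. Below is a minimal, hypothetical PyTorch illustration of soft cross-modality feature correspondence computed on multi-scale feature maps, with scale-specific loss weights standing in for the "different constraints at different scales" idea of the FRM. The function names (`correspondence_loss`, `multi_scale_correspondence`), the soft-attention matching formulation, and the weights are our assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch, assuming a two-stream ResNet-style backbone that yields
# same-shaped feature maps per scale for the visible and X-ray branches.
import torch
import torch.nn.functional as F


def correspondence_loss(feat_vis: torch.Tensor, feat_xray: torch.Tensor) -> torch.Tensor:
    """Soft dense correspondence between one pair of same-scale feature maps.

    feat_vis, feat_xray: (B, C, H, W). Each visible location is softly
    matched to similar X-ray locations, tolerating spatial deformation.
    """
    v = F.normalize(feat_vis.flatten(2), dim=1)   # (B, C, H*W), unit channel norm
    x = F.normalize(feat_xray.flatten(2), dim=1)  # (B, C, H*W)
    sim = torch.bmm(v.transpose(1, 2), x)         # (B, H*W, H*W) cosine similarities
    attn = sim.softmax(dim=-1)                    # visible -> X-ray soft assignment
    x_matched = torch.bmm(attn, x.transpose(1, 2)).transpose(1, 2)  # (B, C, H*W)
    # Pull each visible feature toward its softly matched X-ray feature.
    return (v - x_matched).pow(2).mean()


def multi_scale_correspondence(feats_vis, feats_xray, scale_weights):
    """Weighted sum of per-scale correspondence losses; the hypothetical
    per-scale weights mimic imposing different constraints at each scale."""
    return sum(w * correspondence_loss(fv, fx)
               for w, fv, fx in zip(scale_weights, feats_vis, feats_xray))


if __name__ == "__main__":
    # Toy shapes for two scales; real shapes depend on the backbone used.
    f_vis = [torch.randn(2, 256, 24, 8), torch.randn(2, 512, 12, 4)]
    f_xray = [torch.randn(2, 256, 24, 8), torch.randn(2, 512, 12, 4)]
    print(multi_scale_correspondence(f_vis, f_xray, scale_weights=[1.0, 0.5]).item())
```

In this sketch, each visible-image location is softly matched to the most similar X-ray locations before the features are pulled together, which is one common way to tolerate deformation between modalities; the actual CMMFC correspondence and FRM constraints may differ.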



Availability of data and materials

This study did not generate any new data. The proposed method was evaluated on the publicly available RX01 dataset (S. Chan, J. Cui, Y. Wu, H. Wang and C. Bai, "Visible-Xray Cross-Modality Package Re-Identification," 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 2023, pp. 2579–2584, doi: 10.1109/ICME55011.2023.00439).


Acknowledgements

This work is partially supported by the Zhejiang Provincial Natural Science Foundation of China (No. LY23F020023), the Anhui Key Laboratory of Bionic Sensing and Advanced Robot Technology Project (AHFS2024KF04), and the National Natural Science Foundation of China under Grants No. U20A20196 and 61906168.

Author information


Contributions

All authors reviewed the manuscript. Conceptualization: Sixian Chan, Jiaao Cui and Hongqiang Wang. Investigation: Sixian Chan, Jiaao Cui, Yonggan Wu, Hongqiang Wang and Cong Bai. Software and validation: Jiaao Cui, Yonggan Wu and Sixian Chan. Writing—original draft preparation: Sixian Chan, Jiaao Cui and Hongqiang Wang. Formal analysis: Sixian Chan, Jiaao Cui, Yonggan Wu, Cong Bai and Hongqiang Wang. Funding acquisition: Cong Bai and Sixian Chan. Figure preparation: Sixian Chan, Jiaao Cui and Yonggan Wu. Data interpretation: Sixian Chan, Jiaao Cui, Yonggan Wu and Hongqiang Wang.

Corresponding author

Correspondence to Hongqiang Wang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Communicated by Bing-kun Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Chan, S., Cui, J., Wu, Y. et al. Multi-scale feature correspondence and restriction mechanism for visible X-ray baggage re-Identification. Multimedia Systems 30, 315 (2024). https://doi.org/10.1007/s00530-024-01513-7

  • Received:

  • Accepted:

  • Published:

Keywords
