Abstract
Multi-spectral pedestrian detection has attracted extensive attention in recent years. In particular, combining RGB and thermal infrared images enables around-the-clock operation, even under poor illumination. However, RGB and thermal infrared (RGB-T) image pairs are often not well aligned, which degrades pedestrian detection accuracy. To address this, this paper proposes a Multi-scale Alignment and Differential Enhancement Network (MADENet) for multi-spectral pedestrian detection, consisting of a Cross-Modality Differential Enhancement Module (CDEM) and a Multi-scale Spatial Alignment Module (MSAM). CDEM is embedded in the backbone to suppress redundant features and extract complementary information between modalities, while MSAM aligns the RGB-T features by transforming the thermal features with the RGB features as the reference. The proposed network is evaluated on the public KAIST dataset across different scenarios. Experimental results show that the proposed method outperforms state-of-the-art methods, reaching a miss rate of 8.01 on the full test set.
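The idea of suppressing redundant features while passing complementary ones between modalities can be illustrated with a small toy sketch. The abstract does not specify how CDEM is built, so everything below is an assumption made for illustration: the function names (`global_avg_pool`, `scale_add`, `differential_enhance`), the tanh gate, and the nested-list feature layout are all hypothetical and are not the authors' implementation.

```python
import math

def global_avg_pool(feat):
    """Average each channel of a [C][H][W] nested-list feature map."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]

def scale_add(base, other, gates):
    """Return base + gate_c * other, applied channel by channel."""
    return [
        [[b + g * o for b, o in zip(brow, orow)] for brow, orow in zip(bch, och)]
        for bch, och, g in zip(base, other, gates)
    ]

def differential_enhance(rgb, thermal):
    """Toy differential gating (hypothetical, not the paper's CDEM)."""
    # Per-channel mean of the modality difference (pooling is linear,
    # so pooling each map and subtracting equals pooling the difference).
    diff_means = [r - t for r, t in zip(global_avg_pool(rgb), global_avg_pool(thermal))]
    # Gate in [0, 1): 0 when the channel means agree (redundant content),
    # approaching 1 when they differ strongly (complementary content).
    gates = [math.tanh(abs(d)) for d in diff_means]
    # Each modality receives the other modality's features, scaled by the gate.
    rgb_out = scale_add(rgb, thermal, gates)
    thermal_out = scale_add(thermal, rgb, gates)
    return rgb_out, thermal_out, gates

# Two channels of size 1x2: channel 0 is identical in both modalities,
# channel 1 carries signal only in the RGB map.
rgb = [[[1.0, 1.0]], [[2.0, 2.0]]]
thermal = [[[1.0, 1.0]], [[0.0, 0.0]]]
rgb_out, thermal_out, gates = differential_enhance(rgb, thermal)
```

In this toy setup the channel on which the modalities agree gets a gate of 0 and is left unchanged, while the channel present only in RGB is injected into the thermal stream. A learned module would replace the fixed tanh gate with trainable layers.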
Acknowledgment
This work was sponsored by the Beijing Nova Program (20230484409), the National Natural Science Foundation of China (62272322, 62272323), and the Applied Basic Research Project of Liaoning Province (2022JH2/101300279).
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shao, Z., Chen, Y., Zou, Y., Zhang, J., Guan, Y. (2025). Weakly Aligned Multi-spectral Pedestrian Detection via Cross-Modality Differential Enhancement and Multi-scale Spatial Alignment. In: Antonacopoulos, A., Chaudhuri, S., Chellappa, R., Liu, CL., Bhattacharya, S., Pal, U. (eds) Pattern Recognition. ICPR 2024. Lecture Notes in Computer Science, vol 15330. Springer, Cham. https://doi.org/10.1007/978-3-031-78113-1_5
DOI: https://doi.org/10.1007/978-3-031-78113-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-78112-4
Online ISBN: 978-3-031-78113-1
eBook Packages: Computer Science, Computer Science (R0)