Abstract
To achieve reliable and efficient object detection in thermal infrared (TIR) images, we propose TIRNet, a novel object detection approach built on a convolutional neural network (CNN). Instead of a deep CNN backbone (ResNet, ResNeXt), which suffers from low speed and high computational cost, a lightweight feature extractor (VGG) is adopted. To obtain robust and discriminative features for accurate box regression and classification, a Residual Branch is introduced. Notably, this branch exists only in the training phase, so it adds no time at inference. All computation is encapsulated in a single network, so TIRNet can be optimized and tested end-to-end. Furthermore, a continuous information fusion strategy is proposed to improve detection performance; it effectively addresses problems such as complex backgrounds and occlusion, and yields more accurate and smoother detection results. To evaluate the approach on real-world data, we collect a China Thermal Infrared (CTIR) dataset. We also evaluate the approach on the public KAIST Multispectral dataset. Comparative experiments show that our approach achieves state-of-the-art detection accuracy while maintaining high detection efficiency.
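The idea of a branch that exists only during training can be illustrated with a minimal sketch. This is not the TIRNet architecture: the class `TinyDetector`, the function `total_loss`, the weight `aux_weight`, and all dimensions are hypothetical names chosen for illustration, and plain Python lists stand in for feature maps. The sketch only shows the control flow, in which an auxiliary head contributes a loss term during training and is skipped entirely at inference.

```python
import random


class TinyDetector:
    """Toy model with a shared backbone, a main head, and a training-only head.

    Hypothetical illustration; not the network described in the paper.
    """

    def __init__(self, dim=4, seed=0):
        rng = random.Random(seed)
        self.backbone_w = [rng.uniform(-1, 1) for _ in range(dim)]
        self.main_w = [rng.uniform(-1, 1) for _ in range(dim)]
        # Auxiliary weights are only ever read when training=True.
        self.aux_w = [rng.uniform(-1, 1) for _ in range(dim)]
        self.aux_calls = 0  # counts how often the auxiliary head runs

    def _backbone(self, x):
        # Shared features: element-wise scaling followed by ReLU.
        return [max(0.0, w * xi) for w, xi in zip(self.backbone_w, x)]

    @staticmethod
    def _head(weights, features):
        return sum(w * f for w, f in zip(weights, features))

    def forward(self, x, training):
        features = self._backbone(x)
        main_out = self._head(self.main_w, features)
        if not training:
            # Inference path: the auxiliary head is never evaluated.
            return main_out, None
        self.aux_calls += 1
        aux_out = self._head(self.aux_w, features)
        return main_out, aux_out


def total_loss(main_out, aux_out, target, aux_weight=0.5):
    """Squared error on the main output plus a weighted auxiliary term."""
    loss = (main_out - target) ** 2
    if aux_out is not None:
        loss += aux_weight * (aux_out - target) ** 2
    return loss


model = TinyDetector()
x = [0.5, -1.0, 2.0, 0.1]
train_main, train_aux = model.forward(x, training=True)
test_main, test_aux = model.forward(x, training=False)
```

In a real framework, the auxiliary loss would also send gradients into the shared backbone, which is how such a branch can shape the features used by the main head; the toy above omits gradient computation. The main output is identical in both modes, and only the training call touches the auxiliary weights.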
Acknowledgements
This work was supported by the National Natural Science Foundation of China under Grant No. 61871024, the Key Science Projects of Shanxi Province (No. 201903D03111114), and the Science and Technology Project of Shanxi Jinzhong Development Zone.
About this article
Cite this article
Dai, X., Yuan, X. & Wei, X. TIRNet: Object detection in thermal infrared images for autonomous driving. Appl Intell 51, 1244–1261 (2021). https://doi.org/10.1007/s10489-020-01882-2
DOI: https://doi.org/10.1007/s10489-020-01882-2