TIRNet: Object detection in thermal infrared images for autonomous driving

Dai, Xuerui; Yuan, Xue; Wei, Xueye

doi:10.1007/s10489-020-01882-2

Published: 19 September 2020

Applied Intelligence, Volume 51, pages 1244–1261 (2021)

Abstract

In this study, we propose TIRNet, a novel object detection approach built on a convolutional neural network (CNN), for reliable and efficient object detection in thermal infrared (TIR) images. Instead of a deep CNN backbone (ResNet, ResNeXt), which suffers from low speed and high computational cost, a lightweight feature extractor (VGG) is adopted. To obtain robust and discriminative features for accurate box regression and classification, a Residual Branch is introduced. Notably, it exists only in the training phase, so it adds no time at inference. All computation is encapsulated in a single network, so TIRNet can be optimized and tested end to end. Furthermore, a continuous information fusion strategy is proposed to improve detection performance; it addresses problems such as complex backgrounds and occlusion, and yields more accurate and smoother detection results. To evaluate the approach on real-world data, a China Thermal Infrared (CTIR) dataset was collected. We also evaluate the approach on the public KAIST Multispectral dataset. Comparative experiments show that our approach achieves state-of-the-art detection accuracy while maintaining high detection efficiency.
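The abstract does not spell out how the continuous information fusion strategy works. As a rough illustration of the general idea of using the previous frame to smooth the current detections, the sketch below greedily matches boxes between two consecutive frames by intersection over union (IoU) and blends matched boxes. Everything here is an assumption made for illustration, not the authors' method: the function names `iou` and `smooth_detections`, the greedy matching, and the `iou_thresh` and `alpha` parameters are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def smooth_detections(prev: List[Box], curr: List[Box],
                      iou_thresh: float = 0.3, alpha: float = 0.5) -> List[Box]:
    """Blend each current box with its best-overlapping previous box.

    Matching is greedy and one-to-one (an illustrative assumption).
    Current boxes without a match above iou_thresh pass through unchanged.
    """
    used = set()  # indices of previous boxes already matched
    out: List[Box] = []
    for c in curr:
        best_j, best_iou = None, iou_thresh
        for j, p in enumerate(prev):
            if j in used:
                continue
            v = iou(c, p)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j is None:
            out.append(c)
        else:
            used.add(best_j)
            p = prev[best_j]
            # Weighted average of current and previous coordinates.
            out.append(tuple(alpha * ci + (1 - alpha) * pi
                             for ci, pi in zip(c, p)))
    return out


if __name__ == "__main__":
    prev_boxes = [(0.0, 0.0, 10.0, 10.0)]
    curr_boxes = [(2.0, 0.0, 12.0, 10.0), (100.0, 100.0, 110.0, 110.0)]
    print(smooth_detections(prev_boxes, curr_boxes))
```

A lower `alpha` leans more on the previous frame and gives smoother but laggier boxes; a tracker with a motion model would be the usual next step, which this sketch deliberately leaves out.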

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61871024, the Key Science Projects of Shanxi Province under No. 201903D03111114, and the Science and Technology Project of Shanxi Jinzhong Development Zone.

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, China
Xuerui Dai, Xue Yuan & Xueye Wei

Corresponding author

Correspondence to Xue Yuan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dai, X., Yuan, X. & Wei, X. TIRNet: Object detection in thermal infrared images for autonomous driving. Appl Intell 51, 1244–1261 (2021). https://doi.org/10.1007/s10489-020-01882-2

Published: 19 September 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10489-020-01882-2
