TIRNet: Object detection in thermal infrared images for autonomous driving

Applied Intelligence

Abstract

In this study, we propose TIRNet, a novel approach built on a convolutional neural network (CNN) for reliable and efficient object detection in thermal infrared (TIR) images. Instead of a deep CNN backbone (ResNet, ResNeXt), which suffers from low speed and high computational cost, a lightweight feature extractor (VGG) is adopted. To obtain robust and discriminative features for accurate box regression and classification, a Residual Branch is introduced. Notably, it exists only in the training phase, so it adds no time at inference. All computation is encapsulated in a single network, so TIRNet can be optimized and tested end to end. Furthermore, a continuous information fusion strategy is proposed to improve detection performance; it effectively handles problems such as complex backgrounds and occlusion, and yields more accurate and smoother detection results. To obtain a real-world dataset and evaluate the approach, a China Thermal Infrared (CTIR) dataset was collected. We also evaluate the proposed approach on the public KAIST Multispectral dataset. As the comparative experiments demonstrate, our approach achieves state-of-the-art detection accuracy while maintaining high detection efficiency.
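The idea of a branch that exists only during training can be illustrated with a generic auxiliary-supervision pattern. The sketch below is not the TIRNet architecture, whose internals the abstract does not describe. It is a toy model in plain Python, and every name in it (`TinyDetector`, `aux_branch`, `training_loss`, `aux_weight`) is hypothetical. It only shows why such a branch costs nothing at inference: the extra branch is evaluated when `training=True` and skipped otherwise, while the main output stays the same.

```python
import random


class TinyDetector:
    """Toy stand-in for a detector with a training-only auxiliary branch.

    Hypothetical illustration only; this is not the TIRNet design.
    """

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_backbone = [rng.uniform(-1, 1) for _ in range(4)]
        self.w_head = rng.uniform(-1, 1)
        self.w_aux = rng.uniform(-1, 1)
        self.aux_calls = 0  # counts how often the extra branch is evaluated

    def backbone(self, x):
        # Shared features: element-wise scaling followed by ReLU.
        return [max(0.0, w * v) for w, v in zip(self.w_backbone, x)]

    def head(self, feats):
        # Main prediction head, used in both training and inference.
        return self.w_head * sum(feats)

    def aux_branch(self, feats):
        # Extra branch that only contributes an additional loss term.
        self.aux_calls += 1
        return self.w_aux * sum(feats)

    def forward(self, x, training=False):
        feats = self.backbone(x)
        main = self.head(feats)
        if training:
            return main, self.aux_branch(feats)
        return main, None  # inference path never touches the extra branch


def training_loss(model, x, target, aux_weight=0.5):
    # Assumed loss form: main squared error plus a weighted auxiliary term.
    main, aux = model.forward(x, training=True)
    return (main - target) ** 2 + aux_weight * (aux - target) ** 2
```

In this toy setup, calling `forward(x)` with the default `training=False` returns the same main prediction as the training path but never evaluates `aux_branch`, which is the property the abstract attributes to the Residual Branch.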



Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61871024, the Key Science Projects of Shanxi Province No. 201903D03111114, and the Science and Technology Project of Shanxi Jinzhong Development Zone.

Author information

Corresponding author

Correspondence to Xue Yuan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Dai, X., Yuan, X. & Wei, X. TIRNet: Object detection in thermal infrared images for autonomous driving. Appl Intell 51, 1244–1261 (2021). https://doi.org/10.1007/s10489-020-01882-2

  • DOI: https://doi.org/10.1007/s10489-020-01882-2
