Adaptive threshold cascade faster RCNN for domain adaptive object detection

Multimedia Tools and Applications

Abstract

Object detection usually assumes that training and test data come from the same distribution, but this assumption does not always hold in practice. Because of domain shift, applying a trained detector to a new domain leads to a large drop in detection accuracy. Domain adaptive object detection has been adopted to maintain high detection accuracy under various kinds of domain shift. Domain adaptive object detection methods mainly include adversarial-based, discrepancy-based, reconstruction-based and hybrid methods, among others. Domain adaptive Faster RCNN is a classical adversarial-based method. To further improve the accuracy of domain adaptive object detection, we propose a method based on Domain adaptive Faster RCNN called adaptive threshold cascade Faster RCNN (ATCFR). ATCFR introduces a cascade strategy and an adaptive threshold strategy. The cascade strategy improves the quality of bounding boxes and addresses the overfitting and mismatch problems in Faster RCNN. The adaptive threshold strategy keeps positive and negative samples balanced and removes the need to set the thresholds manually, as is done in Cascade RCNN. Finally, we evaluate our approach on four classic datasets: Cityscapes, Foggy Cityscapes, SIM 10k and KITTI. Experimental results show that our method achieves higher accuracy than state-of-the-art methods under various domain shift problems.
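
The abstract does not say how the adaptive threshold is computed, so the following is only a minimal sketch of the general idea, not the authors' rule. It assumes a hypothetical quantile-style criterion: at each cascade stage, the IoU threshold is chosen so that roughly a fixed fraction of proposals are labelled positive, with a lower bound of 0.5. The names `adaptive_threshold`, `positive_fraction` and `floor` are illustrative assumptions. For comparison, Cascade RCNN uses manually fixed, increasing IoU thresholds (0.5, 0.6, 0.7 in the original paper).

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def best_ious(proposals: Sequence[Box], gt_boxes: Sequence[Box]) -> List[float]:
    """For each proposal, the highest IoU with any ground-truth box."""
    return [max((iou(p, g) for g in gt_boxes), default=0.0) for p in proposals]


def adaptive_threshold(ious: Sequence[float],
                       positive_fraction: float = 0.25,
                       floor: float = 0.5) -> float:
    """Hypothetical rule (an assumption, not taken from the paper):
    pick the IoU value at which about `positive_fraction` of the
    proposals count as positives, but never go below `floor`."""
    if not ious:
        return floor
    ranked = sorted(ious, reverse=True)
    k = max(1, int(round(positive_fraction * len(ranked))))
    return max(floor, ranked[k - 1])


def label_proposals(ious: Sequence[float], threshold: float) -> List[int]:
    """1 = positive sample, 0 = negative sample."""
    return [1 if v >= threshold else 0 for v in ious]


if __name__ == "__main__":
    gt = [(0.0, 0.0, 10.0, 10.0)]
    # Toy proposals for one stage; a later stage would receive refined
    # boxes with higher IoUs, so its threshold would rise with them.
    stage_proposals = [(0.0, 0.0, 10.0, 9.0), (1.0, 1.0, 10.0, 10.0),
                       (3.0, 3.0, 12.0, 12.0), (8.0, 8.0, 15.0, 15.0)]
    ious = best_ious(stage_proposals, gt)
    t = adaptive_threshold(ious, positive_fraction=0.5)
    print(t, label_proposals(ious, t))
```

Because the threshold follows the IoU distribution of the proposals at each stage, the share of positives stays near the target as box quality improves from stage to stage, which is the balance property the abstract attributes to the adaptive strategy.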

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant 61802056, in part by the Natural Science Foundation of Jilin Province under Grant 20180101043JC, in part by the Development and Reform Committee Foundation of Jilin Province of China under Grant 2019C053-9, and in part by the Open Research Fund of the Key Laboratory of Space Utilization, Chinese Academy of Sciences, under Grant LSU-KFJJ-2019-08.

Author information

Corresponding author

Correspondence to Haihong Yu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shi, X., Li, Z. & Yu, H. Adaptive threshold cascade faster RCNN for domain adaptive object detection. Multimed Tools Appl 80, 25291–25308 (2021). https://doi.org/10.1007/s11042-021-10917-w

  • DOI: https://doi.org/10.1007/s11042-021-10917-w
