RT-Net: replay-and-transfer network for class incremental object detection

Cui, Bo; Hu, Guyue; Yu, Shan

doi:10.1007/s10489-022-03509-0

Published: 03 August 2022

Volume 53, pages 8864–8878, (2023)

Applied Intelligence

Bo Cui^1,2,
Guyue Hu^1,2,3 &
Shan Yu^1,4,5

Abstract

Despite the remarkable performance achieved by DNN-based object detectors, class incremental object detection (CIOD), in which the network has to learn to detect novel classes sequentially, remains a challenge. Catastrophic forgetting is the main problem underlying this difficulty: when training samples for old classes are absent, neural networks tend to detect only the new classes. In this paper, we propose the Replay-and-Transfer Network (RT-Net) to address this issue and accomplish CIOD. For the Region of Interest (RoI) head, we develop a generative replay model that uses stored latent feature distributions to replay features of old classes while new ones are learned. To overcome the drastic changes of the RoI feature space, guided feature distillation and feature translation are introduced to facilitate knowledge transfer from the old model to the new one. In addition, we propose holistic ranking transfer, which transfers the ranking orders of proposals to the new model, to enable the region proposal network to identify high-quality proposals for old classes. Importantly, this framework provides a general solution for CIOD that can be successfully applied to two task settings: set-overlapped, in which the old and new training sets overlap, and set-disjoint, in which the old and new tasks have unique samples. Extensive experiments on standard benchmark datasets, including PASCAL VOC and COCO, show that RT-Net can achieve state-of-the-art performance for CIOD.
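As a rough illustration of two ideas named in the abstract, the sketch below shows (a) replaying old-class features by sampling from stored per-class feature statistics and (b) a listwise loss that rewards a new model for preserving the old model's ranking of proposals. It is not the authors' implementation. The diagonal-Gaussian form of the stored distributions, the ListMLE-style loss, and all function names (`fit_class_gaussians`, `replay_features`, `listwise_ranking_loss`) are assumptions chosen for illustration, since the abstract does not specify these details.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Store a mean and diagonal std per class (assumed form of the
    'stored latent feature distributions'; hypothetical helper)."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]
        stats[int(c)] = (x.mean(axis=0), x.std(axis=0) + 1e-6)
    return stats

def replay_features(stats, n_per_class, rng):
    """Sample pseudo-features for every stored old class, so that old
    classes can be rehearsed without keeping the original images."""
    feats, labels = [], []
    for c, (mu, sigma) in stats.items():
        feats.append(rng.normal(mu, sigma, size=(n_per_class, mu.shape[0])))
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)

def listwise_ranking_loss(old_scores, new_scores):
    """ListMLE-style loss (an assumed stand-in for ranking transfer):
    negative log-likelihood, under a Plackett-Luce model on new_scores,
    of the ordering that sorts old_scores from highest to lowest."""
    order = np.argsort(-old_scores)   # old model's ranking of proposals
    s = new_scores[order]             # new scores arranged in that order
    m = s.max()                       # shift for numerical stability
    # log-sum-exp over each suffix s[i:], via a reversed cumulative sum
    suffix_lse = m + np.log(np.cumsum(np.exp(s - m)[::-1])[::-1])
    return float(np.sum(suffix_lse - s))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "RoI features" for two old classes with different means.
    feats = np.concatenate([rng.normal(0, 1, (100, 8)),
                            rng.normal(5, 1, (100, 8))])
    labels = np.array([0] * 100 + [1] * 100)
    stats = fit_class_gaussians(feats, labels)
    replayed, replayed_labels = replay_features(stats, 5, rng)
    print(replayed.shape, replayed_labels)

    old = np.array([3.0, 2.0, 1.0])
    print(listwise_ranking_loss(old, np.array([3.0, 2.0, 1.0])))  # same order
    print(listwise_ranking_loss(old, np.array([1.0, 2.0, 3.0])))  # reversed
```

In this toy setting the loss is smaller when the new scores preserve the old ordering than when they reverse it, which is the property a ranking-transfer objective needs. Storing only per-class statistics keeps memory independent of the number of old training images.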

Acknowledgements

This work was supported by the National Key Research and Development Program of China (2017YFA0105203), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB32040200) and Beijing Academy of Artificial Intelligence.

Author information

Guyue Hu
Present address: School of Computing, National University of Singapore, Singapore, 117417, Singapore

Authors and Affiliations

Brainnetome Center and National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Bo Cui, Guyue Hu & Shan Yu
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
Bo Cui & Guyue Hu
CAS Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing, 100190, China
Shan Yu
School of Future Technology, University of Chinese Academy of Sciences, Beijing, 100049, China
Shan Yu

Corresponding author

Correspondence to Bo Cui.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Cui, B., Hu, G. & Yu, S. RT-Net: replay-and-transfer network for class incremental object detection. Appl Intell 53, 8864–8878 (2023). https://doi.org/10.1007/s10489-022-03509-0

Accepted: 12 March 2022
Published: 03 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10489-022-03509-0
