Abstract
Despite the remarkable performance achieved by DNN-based object detectors, class incremental object detection (CIOD) remains a challenge, in which the network has to learn to detect novel classes sequentially. Catastrophic forgetting is the main problem underlying this difficulty, as neural networks tend to detect new classes only when training samples for old classes are absent. In this paper, we propose the Replay-and-Transfer Network (RT-Net) to address this issue and accomplish CIOD. We develop a generative replay model to replay features of old classes during learning of new ones for the RoI (Region of Interest) head, using the stored latent feature distributions. To overcome the drastic changes of the RoI feature space, guided feature distillation and feature translation are introduced to facilitate knowledge transfer from the old model to the new one. In addition, we propose holistic ranking transfer, which transfers ranking orders of proposals to the new model, to enable the region proposal network to identify high quality proposals for old classes. Importantly, this framework provides a general solution for CIOD, which can be successfully applied to two task settings: set-overlapped, in which the old and new training sets are overlapped, and set-disjoint, in which the old and new tasks have unique samples. Extensive experiments on standard benchmark datasets including PASCAL VOC and COCO show that RT-Net can achieve state-of-the-art performance for CIOD.
Similar content being viewed by others
References
Belouadah E, Popescu A, Kanellos I (2021) A comprehensive study of class incremental learning algorithms for visual tasks. Neural Netw 135:38–54
Cai Z, Vasconcelos N (2018) Cascade r-cnn: Delving into high quality object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 6154–6162
Castro FM, Marín-Jiménez MJ, Guil N, Schmid C, Alahari K (2018) End-to-end incremental learning. In: Proceedings of the European conference on computer vision (ECCV), pp 233–248
Chen Y, Wang N, Zhang Z (2018) Darkrank: Accelerating deep metric learning via cross sample similarities transfer. In: Proceedings of the AAAI conference on artificial intelligence, pp 2852–2859
Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath A A (2018) Generative adversarial networks: an overview. IEEE Signal Process Mag 35(1):53–65
Dai X, Yuan X, Wei X (2021) Tirnet: Object detection in thermal infrared images for autonomous driving. Appl Intell 51(3):1244–1261
Delange M, Aljundi R, Masana M, Parisot S, Jia X, Leonardis A, Slabaugh G, Tuytelaars T (2021) A continual learning survey: Defying forgetting in classification tasks. IEEE Trans Pattern Anal Mach Intell
Everingham M, Van Gool L, Williams C K, Winn J, Zisserman A (2010) The pascal visual object classes (voc) challenge. Int J Comput Vis 88(2):303–338
Girshick R (2015) Fast r-cnn. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448
Girshick R, Donahue J, Darrell T, Malik J (2015) Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans Pattern Anal Mach Intell 38(1):142–158
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: Proceeding of the advances in neural information processing, pp 2672–2680
Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, Liu T, Wang X, Wang G, Cai J et al (2018) Recent advances in convolutional neural networks. Pattern Recognit 77:354–377
Hao Y, Fu Y, Jiang YG, Tian Q (2019) An end-to-end architecture for class-incremental object detection with knowledge distillation. In: Proceedings of the IEEE international conference on multimedia & expo (ICME), pp 1–6
He K, Zhang X, Ren S, Sun J (2015) Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans Pattern Anal Mach Intell 37(9):1904–1916
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
He K, Gkioxari G, Dollár P, Girshick R (2018) Mask r-cnn. IEEE Trans Pattern Anal Mach Intell 42(2):386–397
He Z, Ren Z, Yang X, Yang Y, Zhang W (2021) Mead: a mask-guided anchor-free detector for oriented aerial object detection. Appl Intell:1–16
Iscen A, Zhang J, Lazebnik S, Schmid C (2020) Memory-efficient incremental learning through feature adaptation. In: Proceedings of the European conference on computer vision (ECCV), pp 699–715
Joseph K, Khan S, Khan FS, Balasubramanian VN (2021) Towards open world object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 5830–5840
Kemker R, Kanan C (2018) Fearnet: Brain-inspired model for incremental learning. In: Proceedings of the international conference on learning representations
Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu A A, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Nat Acad Sci USA 114(13):3521–3526
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceeding of the advances in neural information processing, pp 1097–1105
LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86:2278–2324
LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
Leng J, Liu Y (2021) Context augmentation for object detection. Appl Intell:1–13
Li Z, Hoiem D (2017) Learning without forgetting. IEEE Trans Pattern Anal Mach Intell 40 (12):2935–2947
Lin TY, Maire M, Belongie S, Hays J, Perona P, Ramanan D, Dollár P, Zitnick CL (2014) Microsoft coco: Common objects in context. In: Proceedings of the European conference on computer vision (ECCV), pp 740–755
Lin T Y, Goyal P, Girshick R, He K, Dollár P (2018) Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell 42(2):318–327
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu CY, Berg AC (2016) Ssd: Single shot multibox detector. In: Proceedings of the European conference on computer vision (ECCV), pp 21–37
McCloskey M, Cohen N J (1989) Catastrophic interference in connectionist networks: The sequential learning problem. In: Psychol learn motivat, vol 24, pp 109–165
Parisi G I, Kemker R, Part J L, Kanan C, Wermter S (2019) Continual lifelong learning with neural networks: a review. Neural Netw 113:54–71
Peng C, Zhao K, Lovell B C (2020) Faster ilod: Incremental learning for object detectors based on faster rcnn. Pattern Recognit Lett 140:109–115
Pont-Tuset J, Arbelaez P, Barron J T, Marques F, Malik J (2016) Multiscale combinatorial grouping for image segmentation and object proposal generation. IEEE Trans Pattern Anal Mach Intell 39(1):128–140
Ramakrishnan K, Panda R, Fan Q, Henning J, Oliva A, Feris R (2020) Relationship matters: Relation guided knowledge transfer for incremental learning of object detectors. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops
Rebuffi SA, Kolesnikov A, Sperl G, Lampert CH (2017) icarl: Incremental classifier and representation learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2001–2010
Redmon J, Divvala S, Girshick R, Farhadi A (2016) You only look once: Unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 779–788
Ren S, He K, Girshick R, Sun J (2016a) Faster r-cnn: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
Ren S, He K, Girshick R, Zhang X, Sun J (2016b) Object detection networks on convolutional feature maps. IEEE Trans Pattern Anal Mach Intell 39(7):1476–1481
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg A C, Fei-Fei L (2015) Imagenet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
Shin H, Lee J K, Kim J, Kim J (2017) Continual learning with deep generative replay. In: Proceeding of the advances in neural information processing
Shmelkov K, Schmid C, Alahari K (2017) Incremental learning of object detectors without catastrophic forgetting. In: Proceedings of the IEEE international conference on computer vision, pp 3400–3409
Sun W, Dai L, Zhang X, Chang P, He X (2021) Rsod: Real-time small object detection algorithm in uav-based traffic monitoring. Appl Intell:1–16
Tian R, Shi H, Guo B, Zhu L (2021) Multi-scale object detection for high-speed railway clearance intrusion. Appl Intell:1–16
Wan W, Zhong Y, Li T, Chen J (2018) Rethinking feature distribution for loss functions in image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9117–9126
Wen Y, Zhang K, Li Z, Qiao Y (2019) A comprehensive study on center loss for deep face recognition. Int J Comput Vis 127(6-7):668–683
Wu Y, Chen Y, Wang L, Ye Y, Liu Z, Guo Y, Fu Y (2019) Large scale incremental learning. In: Proceedings of the IEEE Conference on computer vision and pattern recognition, pp 374–382
Xia F, Liu TY, Wang J, Zhang W, Li H (2008) Listwise approach to learning to rank: theory and algorithm. In: Proceedings of the international conference on machine learning, pp 1192–1199
Xiang Y, Fu Y, Ji P, Huang H (2019) Incremental learning using conditional adversarial networks. In: Proceedings of the IEEE international conference on computer vision, pp 6619–6628
Zeng G, Chen Y, Cui B, Yu S (2019) Continual learning of context-dependent processing in neural networks. Nat Mach Intell 1(8):364–372
Zhu D, Xia S, Zhao J, Zhou Y, Niu Q, Yao R, Chen Y (2021) Spatial hierarchy perception and hard samples metric learning for high-resolution remote sensing image object detection. Appl Intell:1–16
Zitnick CL, Dollár P (2014) Edge boxes: Locating object proposals from edges. In: Proceedings of the European conference on computer vision (ECCV), pp 391–405
Acknowledgements
This work was supported by the National Key Research and Development Program of China (2017YFA0105203), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB32040200) and Beijing Academy of Artificial Intelligence.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Cui, B., Hu, G. & Yu, S. RT-Net: replay-and-transfer network for class incremental object detection. Appl Intell 53, 8864–8878 (2023). https://doi.org/10.1007/s10489-022-03509-0
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10489-022-03509-0