Abstract
This paper proposes a novel approach to improving open-world object detection (OWOD) models by combining artificial intelligence-generated content (AIGC) with elastic weight consolidation (EWC). To address low category richness and mitigate catastrophic forgetting, we first use Stable Diffusion with low-rank adaptation (LoRA) fine-tuning to generate customized detection target datasets. These datasets are then used to train an improved open-world region-based efficient model that incorporates an EWC module to constrain parameter changes while new tasks are learned. Experimental results show that our approach achieves a mean average precision (mAP) of 84.7% on the generated datasets, improving category richness while mitigating forgetting of previously learned categories. The proposed method thus balances learning new categories against retaining previously learned ones.
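To make the role of the EWC constraint concrete, the following is a minimal sketch of the standard EWC quadratic penalty, assuming a diagonal Fisher information estimate. It is an illustration under stated assumptions, not the implementation used in this work: the function names `ewc_penalty` and `total_loss`, the flat parameter lists, and the weight `lam` are all hypothetical, and the abstract does not specify how the EWC module is wired into the detector.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    old_params: Sequence[float],
    fisher: Sequence[float],
    lam: float,
) -> float:
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2.

    All arguments are illustrative assumptions:
    params      current parameters while learning the new task
    old_params  parameters saved after training on the previous task
    fisher      diagonal Fisher information estimated on the previous task
    lam         strength of the constraint
    """
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("params, old_params and fisher must have equal length")
    weighted_sq_diff = sum(
        f * (p - o) ** 2 for p, o, f in zip(params, old_params, fisher)
    )
    return 0.5 * lam * weighted_sq_diff


def total_loss(
    new_task_loss: float,
    params: Sequence[float],
    old_params: Sequence[float],
    fisher: Sequence[float],
    lam: float,
) -> float:
    """New-task loss plus the EWC penalty that discourages drifting away
    from parameters that were important for the previous task."""
    return new_task_loss + ewc_penalty(params, old_params, fisher, lam)


if __name__ == "__main__":
    old = [1.0, -2.0]
    fisher = [4.0, 0.0]   # first parameter matters for the old task, second does not
    new = [1.5, 3.0]      # the second parameter moved a lot, but at no cost
    print(ewc_penalty(new, old, fisher, lam=2.0))
```

In this toy example only the first parameter contributes to the penalty, because its Fisher value is nonzero. Parameters with low estimated importance for the old categories remain free to adapt to the new ones, which is the trade-off between learning and retention that the abstract describes.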












Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The code used in this study is available from the corresponding author upon reasonable request.
Materials availability
Materials used in this study are available from the corresponding author upon reasonable request.
Funding
This work was supported by the Natural Science Foundation of Tianjin Municipality, China (grant no. 22JCYBJC01470).
Author information
Contributions
WX contributed to software, methodology, and writing of the original draft. GX contributed to review and editing, data curation, and project administration. NY contributed to review and editing. JL contributed to funding acquisition.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Xue, W., Xu, G., Yang, N. et al. Enhancing open-world object detection with AIGC-generated datasets and elastic weight consolidation. J Supercomput 81, 417 (2025). https://doi.org/10.1007/s11227-024-06910-3
DOI: https://doi.org/10.1007/s11227-024-06910-3