
Enhancing open-world object detection with AIGC-generated datasets and elastic weight consolidation

The Journal of Supercomputing

Abstract

This paper proposes a novel approach to enhancing open-world object detection (OWOD) models by combining artificial intelligence-generated content (AIGC) with elastic weight consolidation (EWC). To address low category richness and mitigate catastrophic forgetting, we first use Stable Diffusion with low-rank adaptation (LoRA) fine-tuning to generate customized detection-target datasets. These datasets are then used to train an improved open-world region-based efficient model that incorporates an EWC module to constrain parameter changes when learning new tasks. Experimental results demonstrate that our approach achieves a mean average precision of 84.7% on the generated datasets, significantly improving category richness while mitigating forgetting of previously learned categories. The proposed method effectively balances learning new categories with retaining memory of old ones, advancing the frontiers of OWOD research.
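To make the consolidation step concrete, the following is a minimal PyTorch-style sketch of the standard EWC penalty (Kirkpatrick et al.) that the abstract refers to. During training on new categories the total objective takes the form L_total(theta) = L_new(theta) + (lambda/2) * sum_i F_i * (theta_i - theta_i*)^2, where F is a diagonal Fisher information estimate and theta* are the parameter values learned on previous categories. This is an illustrative sketch, not the authors' implementation; the names fisher_diag, anchor_params, and ewc_lambda are assumptions.

    import torch

    def ewc_penalty(model, fisher_diag, anchor_params, ewc_lambda=1.0):
        """Quadratic EWC penalty: (lambda / 2) * sum_i F_i * (theta_i - theta_i*)^2.

        fisher_diag   -- dict of per-parameter diagonal Fisher estimates,
                         computed after training on the old categories (assumed given).
        anchor_params -- dict of detached parameter values learned on the old categories.
        """
        penalty = torch.zeros((), device=next(model.parameters()).device)
        for name, param in model.named_parameters():
            if name in fisher_diag:
                penalty = penalty + (fisher_diag[name] * (param - anchor_params[name]) ** 2).sum()
        return 0.5 * ewc_lambda * penalty

    # Hypothetical use while learning newly generated (AIGC) categories:
    # total_loss = detection_loss + ewc_penalty(model, fisher_diag, anchor_params, ewc_lambda)
    # Parameters with large Fisher values (important for old categories) are kept
    # close to their previous values, mitigating catastrophic forgetting.

Because only the diagonal of the Fisher information is used, the memory and compute overhead of the penalty is linear in the number of parameters, which is what makes this form of consolidation practical for large detection backbones.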

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability

The code used in this study is available from the corresponding author upon reasonable request.

Materials availability

Materials used in this study are available from the corresponding author upon reasonable request.

Funding

This work was supported by the Natural Science Foundation of Tianjin Municipality, China (22JCYBJC01470).

Author information

Contributions

WX contributed to software, methodology, and writing (original draft). GX contributed to writing (review and editing), data curation, and project administration. NY contributed to writing (review and editing). JL contributed to funding acquisition.

Corresponding author

Correspondence to Guowei Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xue, W., Xu, G., Yang, N. et al. Enhancing open-world object detection with AIGC-generated datasets and elastic weight consolidation. J Supercomput 81, 417 (2025). https://doi.org/10.1007/s11227-024-06910-3

