Enhancing open-world object detection with AIGC-generated datasets and elastic weight consolidation

Xue, Wenjin; Xu, Guowei; Yang, Nan; Liu, Jian

doi:10.1007/s11227-024-06910-3

Published: 21 January 2025

Volume 81, article number 417 (2025)

The Journal of Supercomputing

Wenjin Xue¹, Guowei Xu¹,², Nan Yang¹ & Jian Liu²

Abstract

This paper proposes a novel approach to improving open-world object detection (OWOD) models by combining artificial intelligence generated content (AIGC) with elastic weight consolidation (EWC). To address low category richness and mitigate catastrophic forgetting, we first use Stable Diffusion with low-rank adaptation (LoRA) fine-tuning to generate customized detection target datasets. These datasets are then used to train an improved open-world region-based efficient model that incorporates an EWC module to constrain parameter changes while learning new tasks. Experimental results show that the approach achieves a mean average precision of 84.7% on the generated datasets, significantly improving category richness while mitigating forgetting of previously learned categories. The proposed method balances learning new categories against retaining knowledge of old ones.
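
To make the role of an EWC term concrete, the following is a minimal sketch of the standard diagonal-Fisher quadratic penalty, (λ/2) Σᵢ Fᵢ (θᵢ − θ*ᵢ)². It is not the authors' implementation, which is available only on request. The function names, the flat parameter lists, and the value of `lam` are assumptions made for illustration.

```python
from typing import Sequence, List


def ewc_penalty(params: Sequence[float],
                anchor_params: Sequence[float],
                fisher_diag: Sequence[float],
                lam: float) -> float:
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameters while learning the new task
    anchor_params -- parameters saved after the previous task (theta*)
    fisher_diag   -- diagonal Fisher information estimated on the old task
    lam           -- penalty strength (hypothetical value chosen by the user)
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all inputs must have the same length")
    weighted_sq = sum(f * (p - a) ** 2
                      for p, a, f in zip(params, anchor_params, fisher_diag))
    return 0.5 * lam * weighted_sq


def ewc_penalty_grad(params: Sequence[float],
                     anchor_params: Sequence[float],
                     fisher_diag: Sequence[float],
                     lam: float) -> List[float]:
    """Gradient of the penalty with respect to each parameter."""
    return [lam * f * (p - a)
            for p, a, f in zip(params, anchor_params, fisher_diag)]


def total_loss(new_task_loss: float,
               params: Sequence[float],
               anchor_params: Sequence[float],
               fisher_diag: Sequence[float],
               lam: float) -> float:
    """New-task loss plus the EWC penalty that discourages forgetting."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

Parameters with a large Fisher value are pulled strongly back toward their old-task values, while parameters with a small Fisher value remain free to adapt to new categories; `lam` sets the overall trade-off between the two.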

Data availability

The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability

The code used in this study is available from the corresponding author upon reasonable request.

Materials availability

Materials used in this study are available from the corresponding author upon reasonable request.


Funding

This work was supported by the Natural Science Foundation of Tianjin Municipality, China (22JCYBJC01470).

Author information

Authors and Affiliations

School of Control Science and Engineering, Tiangong University, Tianjin, 300387, China
Wenjin Xue, Guowei Xu & Nan Yang
School of Mechanical Engineering, Tiangong University, Tianjin, 300387, China
Guowei Xu & Jian Liu

Contributions

WX was responsible for software, methodology, and writing the original draft. GX contributed to review and editing, data curation, and project administration. NY contributed to review and editing. JL was responsible for funding acquisition.

Corresponding author

Correspondence to Guowei Xu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Xue, W., Xu, G., Yang, N. et al. Enhancing open-world object detection with AIGC-generated datasets and elastic weight consolidation. J Supercomput 81, 417 (2025). https://doi.org/10.1007/s11227-024-06910-3

Accepted: 28 December 2024
Published: 21 January 2025
DOI: https://doi.org/10.1007/s11227-024-06910-3
