Boosting Dense Long-Tailed Object Detection from Data-Centric View

Xu, Weichen; Cao, Jian; Fu, Tianhao; Yao, Hongyi; Wang, Yuan

doi:10.1007/978-3-031-26313-2_34

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13843)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Several re-sampling and re-weighting approaches have been proposed in recent literature to address long-tailed object detection. However, state-of-the-art approaches still struggle on the rare class. From a data-centric view, this is due to the scarcity of training data for the rare class and to data imbalance. Some data augmentations that generate more training data perform well in general object detection, yet they are rarely leveraged in long-tailed object detection. We reveal that the real culprit is that data imbalance has not been alleviated, or has even been intensified. In this paper, we propose REDet, a rare-data-centric detection framework that simultaneously generates training data for the rare class and deals with data imbalance. REDet contains data operations at two levels. At the instance level, Copy-Move data augmentation independently rebalances the number of instances of different classes according to their rarity. Specifically, we copy instances of the rare class in an image and then move them to other locations in the same image. At the anchor level, to generate more supervision for the rare class within a reasonable range, we propose Long-Tailed Training Sample Selection (LTTSS), which dynamically determines the positive samples for each instance based on the rarity of its class. Comprehensive experiments on the challenging LVIS v1 dataset demonstrate the effectiveness of the proposed approach. We achieve an overall 30.2% AP and obtain significant performance improvements on the rare class.
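The instance-level Copy-Move step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `copy_move`, the `copies_per_class` mapping (class id to number of extra copies, standing in for "according to their rarity"), the use of rectangular box crops rather than instance masks, and uniform random placement are all assumptions made for illustration. The sketch also does not check whether a pasted copy occludes an existing instance.

```python
import numpy as np


def copy_move(image, boxes, labels, copies_per_class, rng=None):
    """Copy box crops of selected classes and paste them elsewhere in the same image.

    image: H x W x C array.
    boxes: list of (x1, y1, x2, y2) integer pixel boxes.
    labels: list of class ids, one per box.
    copies_per_class: hypothetical mapping {class id: extra copies}; classes
        missing from the mapping are left untouched.
    Returns a new image plus the extended box and label lists.
    """
    rng = rng if rng is not None else np.random.default_rng()
    out = image.copy()
    new_boxes, new_labels = list(boxes), list(labels)
    h, w = image.shape[:2]

    for (x1, y1, x2, y2), label in zip(boxes, labels):
        n_copies = copies_per_class.get(label, 0)
        bw, bh = x2 - x1, y2 - y1
        # Skip classes that are not rebalanced and degenerate or oversized boxes.
        if n_copies == 0 or bw <= 0 or bh <= 0 or bw > w or bh > h:
            continue
        # Always crop from the original image so copies are not contaminated
        # by earlier pastes.
        patch = image[y1:y2, x1:x2]
        for _ in range(n_copies):
            # Sample a top-left corner so the pasted box stays inside the image.
            nx = int(rng.integers(0, w - bw + 1))
            ny = int(rng.integers(0, h - bh + 1))
            out[ny:ny + bh, nx:nx + bw] = patch
            new_boxes.append((nx, ny, nx + bw, ny + bh))
            new_labels.append(label)

    return out, new_boxes, new_labels
```

Because the number of copies is set per class, rarer classes can be given more copies than frequent ones, which is how this sketch mirrors the rebalancing idea; the actual rarity-to-copies rule used by REDet is not stated in the abstract.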

Acknowledgments

This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFE0203801).

Author information

Authors and Affiliations

Peking University, Beijing, China
Weichen Xu, Jian Cao, Tianhao Fu, Hongyi Yao & Yuan Wang

Corresponding author

Correspondence to Jian Cao.

Editor information

Editors and Affiliations

University of Wollongong, Wollongong, NSW, Australia
Lei Wang
University of Bonn, Bonn, Germany
Juergen Gall
University of Adelaide, Adelaide, SA, Australia
Tat-Jun Chin
National Institute of Informatics, Tokyo, Japan
Imari Sato
Johns Hopkins University, Baltimore, MD, USA
Rama Chellappa

Electronic supplementary material

Supplementary material 1 (PDF, 153 KB)

About this paper

Cite this paper

Xu, W., Cao, J., Fu, T., Yao, H., Wang, Y. (2023). Boosting Dense Long-Tailed Object Detection from Data-Centric View. In: Wang, L., Gall, J., Chin, TJ., Sato, I., Chellappa, R. (eds) Computer Vision – ACCV 2022. ACCV 2022. Lecture Notes in Computer Science, vol 13843. Springer, Cham. https://doi.org/10.1007/978-3-031-26313-2_34

DOI: https://doi.org/10.1007/978-3-031-26313-2_34
Published: 02 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-26312-5
Online ISBN: 978-3-031-26313-2
eBook Packages: Computer Science, Computer Science (R0)