DOI: 10.1145/3604078.3604120
research-article

Density-aware Object Detection in Aerial Images

Published: 26 October 2023

ABSTRACT

Detecting densely arranged objects is challenging because object density lacks a generic definition and the features of nearby objects are coupled. This paper proposes information-theoretic definitions of object density at the instance, image, and dataset levels, called the Density Index (DI). The DI is highly consistent with human perception and serves as a guide for aerial object detection, including data assessment and detector customization. Guided by the DI, we design DeDet to improve the detection of densely arranged objects. DeDet pursues accurate localization of densely arranged objects through Density-aware Label Assignment (DLA) and Density-aware Feature Extraction (DFE), overcoming the heuristic that sample assignment and feature extraction are performed independently for each object. Experiments on DOTA-v1.0 and DOTA-v2.0 show that DeDet brings a significant improvement over the baseline detector.

Published in

ICDIP '23: Proceedings of the 15th International Conference on Digital Image Processing
May 2023
711 pages
ISBN: 9798400708237
DOI: 10.1145/3604078

Copyright © 2023 ACM

Publisher

Association for Computing Machinery, New York, NY, United States
