DB-NMS: improving non-maximum suppression with density-based clustering

  • Original Article
Neural Computing and Applications

Abstract

Non-maximum suppression (NMS) is a post-processing step in most object detection pipelines. It is a greedy algorithm that uses the Intersection over Union (IoU) of bounding boxes to reduce false positives by removing redundant, overlapping boxes, yet it does not fully exploit the geometric distribution of the bounding boxes. It is found that the distribution of the bounding boxes’ center points corresponds to the distribution of objects: local areas where center points cluster densely tend to contain objects, whereas local areas where center points are sparse are regarded as false positives or detector noise. In this work, a density-based NMS (DB-NMS) is proposed, which uses the density distribution of the bounding boxes’ center points to evaluate the importance of different anchors. The proposed DB-NMS obtains better results than the original NMS on the MS-COCO 2017 dataset. Because DB-NMS does not change the network structure, it can be easily integrated into existing object detection pipelines, such as Faster R-CNN and RetinaNet, to improve performance with little loss of computational efficiency.
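
The abstract describes the idea only in words, so the following is a minimal sketch and not the authors' algorithm. It contrasts classic greedy IoU-based NMS with one possible density-aware variant. The function names, the Gaussian kernel, the `bandwidth` and `min_density` parameters, and the rule of rescaling scores by normalised density are all assumptions made for illustration; the paper's actual formulation may differ.

```python
import math

def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Classic greedy NMS: visit boxes by descending score and keep a box
    only if it does not overlap an already kept box by more than iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep

def center_density(boxes, bandwidth=20.0):
    """Gaussian-kernel density of each box center, normalised so that the
    densest center has value 1 (illustrative choice, not from the paper)."""
    centers = [((b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0) for b in boxes]
    dens = []
    for cx, cy in centers:
        dens.append(sum(
            math.exp(-((cx - ox) ** 2 + (cy - oy) ** 2) / (2.0 * bandwidth ** 2))
            for ox, oy in centers
        ))
    top = max(dens) if dens else 1.0
    return [d / top for d in dens]

def density_weighted_nms(boxes, scores, iou_thresh=0.5,
                         min_density=0.5, bandwidth=20.0):
    """Hypothetical density-aware NMS: discard boxes whose centers lie in
    sparse regions, rescale the remaining scores by density, then run
    greedy NMS. Returns indices into the original box list."""
    dens = center_density(boxes, bandwidth)
    idx = [i for i, d in enumerate(dens) if d >= min_density]
    sub_boxes = [boxes[i] for i in idx]
    sub_scores = [scores[i] * dens[i] for i in idx]
    return [idx[k] for k in greedy_nms(sub_boxes, sub_scores, iou_thresh)]

# Three overlapping boxes around one object plus one isolated box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (11, 9, 49, 51), (200, 200, 240, 240)]
scores = [0.9, 0.8, 0.7, 0.6]
print(greedy_nms(boxes, scores))            # isolated box survives
print(density_weighted_nms(boxes, scores))  # isolated box is treated as noise
```

In this toy input, plain NMS keeps the isolated box because it overlaps nothing, while the density-aware sketch drops it because its center has few neighbours. Normalising by the maximum density is a simplification: it would wrongly penalise a genuine object that attracts only a few detections when another object in the same image attracts many.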

Acknowledgements

This work was partially supported by the Fundamental Research Funds for the Central Universities (No. 2232021D-37), the Natural Science Foundation of Shanghai (No. 21ZR1401700), and the National Natural Science Foundation of China (No. 62176052).

Author information

Corresponding author

Correspondence to Xue-song Tang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rui, L., Tang, Xs. & Hao, K. DB-NMS: improving non-maximum suppression with density-based clustering. Neural Comput & Applic 34, 4747–4757 (2022). https://doi.org/10.1007/s00521-021-06628-w

  • DOI: https://doi.org/10.1007/s00521-021-06628-w