ANMS: attention-based non-maximum suppression

Multimedia Tools and Applications

Abstract

Non-maximum suppression (NMS) is an essential part of the object detection pipeline. However, because classification confidence and localization accuracy are not always consistent, NMS may mistakenly eliminate bounding boxes that have low classification confidence but high localization accuracy. In this paper, we propose an attention-based non-maximum suppression (ANMS) algorithm. It reconstructs an attention map by backpropagating the top-level object classification semantic information, and uses that map to obtain object location information. Furthermore, integrating the classification confidence with the attention map of each detection bounding box reduces the inconsistency between classification confidence and object localization. On the PASCAL VOC2007 and PASCAL VOC2012 datasets, the proposed ANMS algorithm achieves performance improvements of 1.85 and 1.24, respectively, over the NMS algorithm. On the MS COCO dataset, it achieves an improvement of 0.3, which supports the effectiveness of ANMS.
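
To make the failure mode described in the abstract concrete, the following is a minimal sketch of greedy NMS next to an attention-rescored variant. The abstract does not state how ANMS fuses confidence and attention, so the fusion here is an assumption: the mean attention value inside each box is blended linearly with the classification score using a hypothetical weight `alpha`. The function names, the toy attention map, and the boxes are illustrative only and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), x2/y2 exclusive


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_thresh: float = 0.5) -> List[int]:
    """Standard greedy NMS: keep the top-scoring box, drop its heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep


def mean_attention(attn: Sequence[Sequence[float]], box: Box) -> float:
    """Mean attention inside a box (assumed proxy for localization quality)."""
    x1, y1, x2, y2 = (int(v) for v in box)
    vals = [attn[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(vals) / len(vals) if vals else 0.0


def attention_rescored_nms(boxes: Sequence[Box], scores: Sequence[float],
                           attn: Sequence[Sequence[float]],
                           iou_thresh: float = 0.5,
                           alpha: float = 0.5) -> List[int]:
    """Hypothetical fusion: linear blend of score and mean attention, then NMS."""
    fused = [(1 - alpha) * s + alpha * mean_attention(attn, b)
             for b, s in zip(boxes, scores)]
    return greedy_nms(boxes, fused, iou_thresh)


# Toy 6x6 attention map: the object occupies pixels 1..4 in both axes.
attn = [[1.0 if 1 <= x < 5 and 1 <= y < 5 else 0.0 for x in range(6)]
        for y in range(6)]
boxes = [(0, 0, 5, 5),   # loose box, higher classification score
         (1, 1, 5, 5)]   # tight box, lower classification score
scores = [0.9, 0.8]

print(greedy_nms(boxes, scores))                    # plain NMS
print(attention_rescored_nms(boxes, scores, attn))  # attention-rescored NMS
```

In this toy case, plain NMS keeps the loose box because its score is higher, while the rescored variant keeps the tight box because the attention mass is concentrated inside it. The real ANMS attention map comes from backpropagating classification information through the detector, which this sketch does not attempt to reproduce.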



Author information

Corresponding author

Correspondence to Chunsheng Guo.


About this article


Cite this article

Guo, C., Cai, M., Ying, N. et al. ANMS: attention-based non-maximum suppression. Multimed Tools Appl 81, 11205–11219 (2022). https://doi.org/10.1007/s11042-022-12142-5


  • DOI: https://doi.org/10.1007/s11042-022-12142-5
