research-article

Hard Anchor Attention in Anchor-based Detector

Authors:

Xiao Sun

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing

Pages 331 - 336

https://doi.org/10.1145/3529836.3529940

Published: 21 June 2022

Abstract

In anchor-based object detectors, the redundancy introduced by the symmetry of the anchor generator harms the diversity of positive anchors and causes a performance drop. This paper proposes a simple yet effective sampling strategy called Hard Anchor Attention (HAA). First, the anchor generator is re-examined by studying the contribution of different samples to the overall performance. This study verifies that the harder positive anchors play an important role in training the detector. HAA is then introduced to evaluate the difficulty of refining each anchor and to direct the focus of the training process toward the harder anchors. The experimental results demonstrate that HAA brings performance gains to RetinaNet and further releases the subsequent branches. In particular, on the Pascal VOC dataset and without fine-tuning, HAA outperforms both the random-sampling and all-in baselines.
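The abstract does not say how HAA scores the difficulty of an anchor, so the following is only a minimal sketch of the general idea of weighting harder positive anchors more heavily. It assumes, purely for illustration, that difficulty is measured as `1 - IoU` between a positive anchor and its ground-truth box and that positives are anchors with IoU at or above a threshold of 0.5. The function names `iou` and `hard_anchor_weights`, the threshold, and the normalization are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def hard_anchor_weights(
    anchors: List[Box], gt: Box, pos_thresh: float = 0.5
) -> List[float]:
    """Per-anchor loss weights that emphasize harder positive anchors.

    Assumption (not from the paper): hardness = 1 - IoU with the
    ground-truth box. Non-positive anchors get weight 0, and the weights
    of the positives are scaled to sum to the number of positives.
    """
    ious = [iou(a, gt) for a in anchors]
    is_pos = [v >= pos_thresh for v in ious]
    hardness = [1.0 - v if p else 0.0 for v, p in zip(ious, is_pos)]
    total = sum(hardness)
    n_pos = sum(is_pos)
    if total == 0.0:
        # No positives, or every positive matches perfectly: fall back
        # to uniform weights over the positives.
        return [1.0 if p else 0.0 for p in is_pos]
    return [h * n_pos / total for h in hardness]


gt_box = (0.0, 0.0, 10.0, 10.0)
anchor_boxes = [
    (0.0, 0.0, 10.0, 10.0),    # perfect match, easiest positive
    (0.0, 0.0, 10.0, 8.0),     # IoU 0.8
    (0.0, 0.0, 10.0, 6.0),     # IoU 0.6, hardest positive
    (20.0, 20.0, 30.0, 30.0),  # no overlap, not a positive
]
weights = hard_anchor_weights(anchor_boxes, gt_box)
```

In this toy setting the anchor with IoU 0.6 receives roughly twice the weight of the anchor with IoU 0.8, while the perfectly matched anchor and the non-overlapping anchor receive zero. Such weights could multiply the per-anchor loss in place of random or all-in sampling; the actual scoring used by HAA may differ.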

Cited By

Yin Y, Kou T, Ochieng N, Bao X (2023). PASFLN: Positional Association and Semantic Fusion Learning Network for Traffic Object Detection. 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC), 329-334. Online publication date: 24-Sep-2023. https://doi.org/10.1109/ITSC57777.2023.10422508

Published In

ICMLC '22: Proceedings of the 2022 14th International Conference on Machine Learning and Computing

February 2022

570 pages

ISBN:9781450395700

DOI:10.1145/3529836

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICMLC 2022

ICMLC 2022: 2022 14th International Conference on Machine Learning and Computing

February 18 - 21, 2022

Guangzhou, China
