DOI: 10.1145/3373509.3373528
research-article

Object Detection Based on Feature Scale Fusion and Feature Scale Enhancement

Published: 25 March 2020

Abstract

Object detection is widely used for image analysis and is one of the most important tasks in computer vision. Because high image resolution leads to high computational cost and low image resolution leads to low accuracy, the choice of image resolution is a key bottleneck for CNN-based object detection. In this work, we address this problem with feature scale fusion and feature scale enhancement. The proposed method achieves significant accuracy gains by applying a Feature Scale Fusion Module (FSFM) and a Feature Scale Enhancement Module (FSEM) to the feature extraction layer of Faster R-CNN. The enhanced feature map then becomes the input to the RPN and ROI layers of Faster R-CNN. The accuracy gain of the proposed modules is verified on the Pascal VOC and MS COCO datasets, where they obtain significant improvements over state-of-the-art detection models.
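The abstract does not describe the internals of FSFM or FSEM, so the following is only a minimal sketch of the general idea, under stated assumptions: fusion is modelled as nearest-neighbour upsampling of a coarse feature map followed by element-wise addition to a finer one, and enhancement is modelled as a simple per-channel gate (sigmoid of the channel mean, with no learned weights). The function names `upsample2x`, `fuse_scales`, and `enhance_channels` are hypothetical and are not taken from the paper.

```python
import math


def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse_scales(fine, coarse):
    """Assumed fusion step: upsample each coarse channel and add it to the
    matching fine channel. Not the paper's actual FSFM."""
    fused = []
    for fine_ch, coarse_ch in zip(fine, coarse):
        up = upsample2x(coarse_ch)
        fused.append([[a + b for a, b in zip(f_row, u_row)]
                      for f_row, u_row in zip(fine_ch, up)])
    return fused


def enhance_channels(fmaps):
    """Assumed enhancement step: scale each channel by the sigmoid of its
    global mean. Not the paper's actual FSEM; no learned parameters."""
    out = []
    for ch in fmaps:
        count = sum(len(row) for row in ch)
        mean = sum(sum(row) for row in ch) / count
        gate = 1.0 / (1.0 + math.exp(-mean))
        out.append([[v * gate for v in row] for row in ch])
    return out


# Toy input: 2 channels, a 4x4 "fine" map and a 2x2 "coarse" map.
fine = [[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]]
coarse = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]

fused = fuse_scales(fine, coarse)
enhanced = enhance_channels(fused)  # would feed the RPN and ROI stages
```

In a real detector, both steps would operate on backbone tensors and would include learned convolutions; the sketch only shows how a fused, reweighted feature map can replace the plain backbone output ahead of the RPN and ROI layers.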

    Published In

    ICCPR '19: Proceedings of the 2019 8th International Conference on Computing and Pattern Recognition
    October 2019
    522 pages
    ISBN:9781450376570
    DOI:10.1145/3373509
    In-Cooperation

    • Hebei University of Technology
    • Beijing University of Technology

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Object detection
    2. Faster R-CNN
    3. feature scale enhancement
    4. feature scale fusion

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCPR '19
