Abstract
The mainstream methods for object detection can be divided into two types: one-stage (based on Integrated Convolutional Network) and two-stage (based on Candidate Box Convolutional Network). One-stage methods are fast but less accurate, while two-stage methods are accurate but slow. This paper therefore proposes a novel convolutional neural network model that meets both the efficiency and the accuracy requirements of real-time object detection. Built on the Single Shot Detector (SSD) and Feature Pyramid Networks (FPN), the proposed model addresses the issue of small object detection. Introducing the receptive field block (RFB) and the RefineDet network further improves the accuracy of the model. Experimental results show that the model's mAP exceeds 80% and its speed is above 30 FPS when the input image size is 320 × 320.
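The abstract credits the receptive field block with part of the accuracy gain. As a rough illustration of why dilated branches matter, the sketch below computes the one-dimensional receptive field of a stack of convolutions using the standard recurrence (each layer adds `(kernel - 1) * dilation * jump` input pixels, and the jump is multiplied by the stride). The three branch configurations and the names `receptive_field`, `branch_fields`, and `BRANCHES` are assumptions chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, Iterable, List, Tuple

# One layer is (kernel_size, stride, dilation). Hypothetical notation.
Layer = Tuple[int, int, int]


def receptive_field(layers: Iterable[Layer]) -> int:
    """Receptive field, in input pixels along one axis, of a conv stack."""
    rf = 1    # a single input pixel sees only itself
    jump = 1  # spacing, in input pixels, between adjacent output positions
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Illustrative multi-branch layout (assumed, not the paper's configuration):
# each branch ends in a 3x3 convolution with a different dilation rate.
BRANCHES: Dict[str, List[Layer]] = {
    "small": [(1, 1, 1), (3, 1, 1)],
    "medium": [(1, 1, 1), (3, 1, 1), (3, 1, 3)],
    "large": [(1, 1, 1), (5, 1, 1), (3, 1, 5)],
}


def branch_fields(branches: Dict[str, List[Layer]]) -> Dict[str, int]:
    """Map each branch name to its receptive field."""
    return {name: receptive_field(layers) for name, layers in branches.items()}


if __name__ == "__main__":
    for name, rf in branch_fields(BRANCHES).items():
        print(f"{name}: {rf} x {rf}")
```

Under these assumed settings the branches cover 3, 9, and 15 input pixels per axis, so concatenating them would let a single block respond to objects at several scales without adding stride.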
Acknowledgements
Supported by the Fundamental Research Funds for the Central Universities under Grant Numbers N2017003 and N2017004.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Dong, Y., Gao, T. (2021). A Convolutional Neural Network Model for Object Detection Based on Receptive Field. In: Barolli, L., Poniszewska-Maranda, A., Park, H. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2020. Advances in Intelligent Systems and Computing, vol 1195. Springer, Cham. https://doi.org/10.1007/978-3-030-50399-4_12
DOI: https://doi.org/10.1007/978-3-030-50399-4_12
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50398-7
Online ISBN: 978-3-030-50399-4
eBook Packages: Intelligent Technologies and Robotics (R0)