Abstract
One-stage anchor-based detectors are an efficient and effective way to detect objects in massive image data. However, they neglect many distinguishable features of objects, which lowers detection accuracy. In this paper, we propose a new object detection approach that improves existing one-stage anchor-based methods via a Distinguishable Feature Learning Network (DFL-Net). DFL-Net integrates distinguishable features into the learning process to improve detection accuracy. We implement DFL-Net with two components: a full-scale fusion module and an attention-guided module. In the full-scale fusion module, we first learn distinguishable features at each scale (layer) and then fuse them across all layers to generate full-scale features, in contrast to prior work that considered only one or a few scales and limited features. In the attention-guided module, we extract additional distinguishable features guided by positive and negative samples. We conduct extensive experiments on two public datasets, PASCAL VOC and COCO, to compare DFL-Net with several one-stage approaches. The results show that DFL-Net achieves a high mAP of 83.1% and outperforms all its competitors. We also compare DFL-Net with three two-stage algorithms, and the results likewise suggest the superiority of DFL-Net.
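The abstract does not specify the internals of the two modules, but the general pattern it describes (fuse feature maps from all pyramid scales, then reweight the result with an attention mechanism) can be illustrated with a minimal NumPy sketch. The function names, channel count, and the squeeze-and-excitation-style channel gating below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def upsample_nearest(x, factor):
    # x: (C, H, W) feature map; nearest-neighbor upsampling by row/column repetition
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def full_scale_fusion(features):
    # features: list of (C, H_i, W_i) maps ordered fine -> coarse,
    # each level assumed to have half the resolution of the previous one.
    # Upsample every level to the finest resolution and sum them.
    target_h = features[0].shape[1]
    fused = np.zeros_like(features[0])
    for f in features:
        fused += upsample_nearest(f, target_h // f.shape[1])
    return fused

def channel_attention(x):
    # A simple stand-in for an attention-guided reweighting:
    # global average pool per channel, sigmoid, then scale each channel.
    pooled = x.mean(axis=(1, 2))             # (C,)
    weights = 1.0 / (1.0 + np.exp(-pooled))  # (C,) gate in (0, 1)
    return x * weights[:, None, None]

# Toy three-level pyramid with 8 channels
pyramid = [np.random.rand(8, 32, 32),
           np.random.rand(8, 16, 16),
           np.random.rand(8, 8, 8)]
out = channel_attention(full_scale_fusion(pyramid))
print(out.shape)  # (8, 32, 32)
```

The output retains the finest spatial resolution while aggregating information from every scale, which is the property the full-scale fusion module is described as providing.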
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 62072419).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xie, J., Wan, S., Jin, P. (2021). DFL-Net: Effective Object Detection via Distinguishable Feature Learning. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science(), vol 12924. Springer, Cham. https://doi.org/10.1007/978-3-030-86475-0_20
Print ISBN: 978-3-030-86474-3
Online ISBN: 978-3-030-86475-0