Abstract
One-stage anchor-based detectors are an efficient and effective way to detect objects in massive image data. However, they neglect many distinguishable features of objects, which lowers detection accuracy. In this paper, we propose a new object detection approach that improves existing one-stage anchor-based methods via a Distinguishable Feature Learning Network (DFL-Net). DFL-Net integrates distinguishable features into the learning process to improve detection accuracy. We implement DFL-Net with two components: a full-scale fusion module and an attention-guided module. In the full-scale fusion module, we first learn distinguishable features at each scale (layer) and then fuse them across all layers to generate full-scale features, in contrast to prior work that considered only one or a few scales and limited features. In the attention-guided module, we extract additional distinguishable features guided by positive and negative samples. We conduct extensive experiments on two public datasets, PASCAL VOC and COCO, to compare DFL-Net with several one-stage approaches. The results show that DFL-Net achieves a high mAP of 83.1% and outperforms all its competitors. We also compare DFL-Net with three two-stage algorithms, and the results likewise suggest the superiority of DFL-Net.
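The abstract does not specify the internals of the two modules, but the general pattern it describes (fuse feature maps from all pyramid scales, then reweight the result with an attention mechanism) can be illustrated with a minimal NumPy sketch. The function names, channel count, and the squeeze-and-excitation-style channel gating below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def upsample_nearest(x, factor):
    # x: (C, H, W) feature map; nearest-neighbor upsampling by row/column repetition
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def full_scale_fusion(features):
    # features: list of (C, H_i, W_i) maps ordered fine -> coarse,
    # each level assumed to have half the resolution of the previous one.
    # Upsample every level to the finest resolution and sum them.
    target_h = features[0].shape[1]
    fused = np.zeros_like(features[0])
    for f in features:
        fused += upsample_nearest(f, target_h // f.shape[1])
    return fused

def channel_attention(x):
    # A simple stand-in for an attention-guided reweighting:
    # global average pool per channel, sigmoid, then scale each channel.
    pooled = x.mean(axis=(1, 2))             # (C,)
    weights = 1.0 / (1.0 + np.exp(-pooled))  # (C,) gate in (0, 1)
    return x * weights[:, None, None]

# Toy three-level pyramid with 8 channels
pyramid = [np.random.rand(8, 32, 32),
           np.random.rand(8, 16, 16),
           np.random.rand(8, 8, 8)]
out = channel_attention(full_scale_fusion(pyramid))
print(out.shape)  # (8, 32, 32)
```

The output retains the finest spatial resolution while aggregating information from every scale, which is the property the full-scale fusion module is described as providing.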
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 62072419).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Xie, J., Wan, S., Jin, P. (2021). DFL-Net: Effective Object Detection via Distinguishable Feature Learning. In: Strauss, C., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2021. Lecture Notes in Computer Science(), vol 12924. Springer, Cham. https://doi.org/10.1007/978-3-030-86475-0_20
Print ISBN: 978-3-030-86474-3
Online ISBN: 978-3-030-86475-0