Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms

  • Conference paper
Pattern Recognition (ACPR 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14408)

Abstract

Three-dimensional (3D) object detection is a key technology in autonomous driving and robot control, with applications such as self-driving cars and unmanned aircraft systems. To achieve high detection accuracy, this paper proposes a 3D object detection network that combines a coordinate attention mechanism, which generates voting feature points for better detection ability, with an overlapping region penalty mechanism, which reduces false detections. In comparative experiments on the large-scale public 3D datasets ScanNet and SUN RGB-D, the proposed method obtained a mean average precision (mAP) of 60.1% and 58.0%, respectively, at an intersection-over-union (IoU) threshold of 0.25, outperforming mainstream algorithms such as F-PointNet, VoxelNet and MV3D. The improved method is expected to achieve higher accuracy for 3D object detection relying only on point cloud information.
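The 0.25 threshold quoted above refers to the volumetric overlap between a predicted box and a ground-truth box. As an illustration only, the following minimal Python sketch computes the IoU of two axis-aligned 3D boxes and applies that threshold. The box layout `(xmin, ymin, zmin, xmax, ymax, zmax)` and the function names are assumptions made for this example and are not taken from the paper; benchmark evaluation code may also handle oriented boxes, which this sketch does not.

```python
from typing import Sequence

# Assumed box layout for this sketch: (xmin, ymin, zmin, xmax, ymax, zmax).
Box = Sequence[float]


def box_volume(box: Box) -> float:
    """Volume of an axis-aligned box; zero if any side has non-positive length."""
    dx = max(0.0, box[3] - box[0])
    dy = max(0.0, box[4] - box[1])
    dz = max(0.0, box[5] - box[2])
    return dx * dy * dz


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    # The intersection is itself an axis-aligned box (possibly empty).
    inter_box = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter = box_volume(inter_box)
    union = box_volume(a) + box_volume(b) - inter
    return inter / union if union > 0.0 else 0.0


def is_match(pred: Box, gt: Box, threshold: float = 0.25) -> bool:
    """A prediction matches a ground-truth box when their IoU reaches the threshold."""
    return iou_3d(pred, gt) >= threshold
```

For example, two 2 × 2 × 2 boxes offset by one unit along x share a volume of 4 out of a union of 12, giving an IoU of 1/3, which counts as a match at the 0.25 threshold but would not at a stricter threshold of 0.5.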

References

  1. Ss, A., Svavp, B.: Techniques and challenges of face recognition: a critical review. Procedia Comput. Sci. 143, 536–543 (2018)

  2. Yu, H., Yang, Z., Tan, L., et al.: Methods and datasets on semantic segmentation: a review. Neurocomputing 304, 82–103 (2018)

  3. Shi, S., Wang, X., Li, H.: Pointrcnn: 3d object proposal generation and detection from point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 770–779 (2019)

  4. Qi, C.R., Yi, L., Su, H., et al.: Pointnet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, 30 (2017)

  5. Qi, C.R., Litany, O., He, K., et al.: Deep hough voting for 3d object detection in point clouds. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9277–9286 (2019)

  6. Zhou, Y., Tuzel, O.: Voxelnet: end-to-end learning for point cloud based 3d object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4490–4499 (2018)

  7. Yan, Y., Mao, Y., Li, B.: Second: sparsely embedded convolutional detection. Sensors 18(10), 3337 (2018)

  8. Lang, A.H., Vora, S., Caesar, H., et al.: Pointpillars: fast encoders for object detection from point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12697–12705 (2019)

  9. Shi, S., Guo, C., Jiang, L., et al.: Pv-rcnn: point-voxel feature set abstraction for 3d object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10529–10538 (2020)

  10. Chen, X., Ma, H., Wan, J., et al.: Multi-view 3d object detection network for autonomous driving. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1907–1915 (2017)

  11. Qi, C.R., Liu, W., Wu, C., et al.: Frustum pointnets for 3d object detection from rgb-d data. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 918–927 (2018)

  12. Hou, Q., Zhou, D., Feng, J.: Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13713–13722 (2021)

  13. Dai, A., Chang, A.X., Savva, M., et al.: Scannet: Richly-annotated 3d reconstructions of indoor scenes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5828–5839 (2017)

  14. Song, S., Lichtenberg, S.P., Xiao, J.: Sun rgb-d: A rgb-d scene understanding benchmark suite. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 567–576 (2015)

  15. Song, S., Xiao, J.: Deep sliding shapes for amodal 3d object detection in rgb-d images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 808–816 (2016)

  16. Hou, J., Dai, A., Nießner, M.: 3d-sis: 3d semantic instance segmentation of rgb-d scans. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4421–4430 (2019)

  17. He, K., Gkioxari, G., Dollár, P., et al.: Mask r-cnn. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)

  18. Yi, L., Zhao, W., Wang, H., et al.: Gspn: generative shape proposal network for 3d instance segmentation in point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3947–3956 (2019)

Acknowledgments

This research was supported by the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ21A040007) and the Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y201941856).

Corresponding author

Correspondence to Xiaoqin Zhang.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Li, W., Zhu, S., Liu, H., Zhang, P., Zhang, X. (2023). Three-Dimensional Object Detection Network Based on Coordinate Attention and Overlapping Region Penalty Mechanisms. In: Lu, H., Blumenstein, M., Cho, SB., Liu, CL., Yagi, Y., Kamiya, T. (eds) Pattern Recognition. ACPR 2023. Lecture Notes in Computer Science, vol 14408. Springer, Cham. https://doi.org/10.1007/978-3-031-47665-5_7

  • DOI: https://doi.org/10.1007/978-3-031-47665-5_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-47664-8

  • Online ISBN: 978-3-031-47665-5

  • eBook Packages: Computer Science, Computer Science (R0)
