Research article
DOI: 10.1145/3372278.3390693

DAGC: Employing Dual Attention and Graph Convolution for Point Cloud based Place Recognition

Published: 08 June 2020

Abstract

Point cloud based retrieval for place recognition remains an open problem, largely because it is difficult to encode local features efficiently into a global descriptor that adequately describes a scene. Existing studies address the problem by generating a global descriptor for each point cloud and using it to retrieve the matching point cloud from a database. However, they do not make effective use of the relationships between points, and they neglect the differing discriminative power of individual features. In this paper, we propose DAGC, which employs Dual Attention and Graph Convolution for point cloud based place recognition, to address these issues. Specifically, we employ two modules to help extract discriminative and generalizable features that describe a point cloud. A Dual Attention module helps distinguish task-relevant features and weighs the different contributions that other points make to a given point when generating its representation. A Residual Graph Convolution Network (ResGCN) module aggregates the local features of each point's multi-level neighbor points to further improve the representation. In this way, we improve descriptor generation by considering the importance of both points and features and by leveraging the relationships between points. Experiments conducted on different datasets show that our work outperforms current approaches on all evaluation metrics.
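As a rough illustration of the pipeline the abstract outlines, the NumPy sketch below chains a point-wise attention step, a feature-wise (channel) gate, one residual graph convolution over k-nearest neighbors, and max pooling into a global descriptor that is then matched against a database. It is a minimal sketch under stated assumptions, not the DAGC architecture: every function name, the layer sizes, the sigmoid gate, the mean aggregation, and the use of max pooling are hypothetical choices made for illustration, and fixed random matrices stand in for learned weights.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def point_attention(feats):
    """Let every point attend to all other points. feats: (N, C)."""
    scores = feats @ feats.T / np.sqrt(feats.shape[1])  # (N, N) affinities
    weights = softmax(scores, axis=1)                   # each row sums to 1
    return feats + weights @ feats                      # residual connection

def channel_attention(feats):
    """Gate each feature channel by a sigmoid of its mean activation."""
    gate = 1.0 / (1.0 + np.exp(-feats.mean(axis=0)))    # (C,)
    return feats * gate

def knn_indices(xyz, k):
    """Indices of the k nearest neighbors of each point, excluding itself."""
    d2 = ((xyz[:, None, :] - xyz[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]                # (N, k)

def residual_graph_conv(feats, neighbors, weight):
    """Average neighbor features, transform them, and add a skip connection."""
    agg = feats[neighbors].mean(axis=1)                 # (N, C)
    return feats + np.maximum(agg @ weight, 0.0)

def global_descriptor(xyz, k=4, channels=8, seed=0):
    """Map an (N, 3) point cloud to a unit-length descriptor of size `channels`."""
    rng = np.random.default_rng(seed)          # ASSUMPTION: random stand-in weights
    w_in = rng.normal(size=(3, channels))
    w_gc = rng.normal(size=(channels, channels))
    feats = np.maximum(xyz @ w_in, 0.0)        # per-point embedding
    feats = channel_attention(point_attention(feats))
    feats = residual_graph_conv(feats, knn_indices(xyz, k), w_gc)
    desc = feats.max(axis=0)                   # order-independent pooling
    return desc / (np.linalg.norm(desc) + 1e-12)

def retrieve(query_desc, db_descs):
    """Index of the database descriptor closest to the query (Euclidean)."""
    return int(np.argmin(np.linalg.norm(db_descs - query_desc, axis=1)))
```

Because each step is either applied per point or pools over all points, reordering the points of a cloud leaves the descriptor unchanged up to floating-point error, which is the property that lets a re-scanned place be matched to its database entry in this toy setting.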

    Published In

    ICMR '20: Proceedings of the 2020 International Conference on Multimedia Retrieval
    June 2020
    605 pages
    ISBN: 9781450370875
    DOI: 10.1145/3372278
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. attention mechanism
    2. graph convolution networks
    3. place recognition
    4. point cloud based retrieval

    Qualifiers

    • Research-article

    Funding Sources

    • Fundamental Research Funds for Central Universities and Research Funds of Renmin University of China
    • National Natural Science Foundation of China

    Conference

    ICMR '20
    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%

    Article Metrics

    • Downloads (Last 12 months)31
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 17 Feb 2025

    Cited By

    • (2025) A Quantum-Inspired Framework in Leader-Servant Mode for Large-Scale Multi-Modal Place Recognition. IEEE Transactions on Intelligent Transportation Systems 26:2 (2027-2039). DOI: 10.1109/TITS.2024.3497574. Online publication date: Feb-2025
    • (2024) SG-LPR: Semantic-Guided LiDAR-Based Place Recognition. Electronics 13:22 (4532). DOI: 10.3390/electronics13224532. Online publication date: 18-Nov-2024
    • (2024) LiDAR-Based Place Recognition For Autonomous Driving: A Survey. ACM Computing Surveys 57:4 (1-36). DOI: 10.1145/3707446. Online publication date: 5-Dec-2024
    • (2024) Sketch-aided Interactive Fusion Point Cloud Place Recognition. Proceedings of the 2024 International Conference on Multimedia Retrieval (1115-1119). DOI: 10.1145/3652583.3657613. Online publication date: 30-May-2024
    • (2024) MVSE-Net: A Multi-View Deep Network With Semantic Embedding for LiDAR Place Recognition. IEEE Transactions on Intelligent Transportation Systems 25:11 (17174-17186). DOI: 10.1109/TITS.2024.3421375. Online publication date: Nov-2024
    • (2024) ComPoint: Can Complex-Valued Representation Benefit Point Cloud Place Recognition? IEEE Transactions on Intelligent Transportation Systems 25:7 (7494-7507). DOI: 10.1109/TITS.2024.3351215. Online publication date: Jul-2024
    • (2024) CapsLoc3D: Point Cloud Retrieval for Large-Scale Place Recognition Based on 3D Capsule Networks. IEEE Transactions on Intelligent Transportation Systems 25:7 (6811-6823). DOI: 10.1109/TITS.2023.3346953. Online publication date: Jul-2024
    • (2024) APFN: Adaptive Perspective-Based Fusion Network for 3-D Place Recognition. IEEE Transactions on Instrumentation and Measurement 73 (1-10). DOI: 10.1109/TIM.2024.3418083. Online publication date: 2024
    • (2024) 3-D LiDAR-Based Place Recognition Techniques: A Review of the Past Ten Years. IEEE Transactions on Instrumentation and Measurement 73 (1-24). DOI: 10.1109/TIM.2024.3403194. Online publication date: 2024
    • (2024) SAGE-Net: Employing Spatial Attention and Geometric Encoding for Point Cloud Based Place Recognition. IEEE Robotics and Automation Letters 9:6 (4958-4965). DOI: 10.1109/LRA.2024.3387112. Online publication date: Jun-2024
