ABSTRACT
This paper presents a flexible module that uses a 3D position attention mechanism to extract contextual features from local regions of a point cloud. The key is to build an effective representation of local features. Because point clouds are irregular, previous point cloud processing algorithms have not fully explored how to enhance local feature extraction. Inspired by the position attention mechanism used in 2D image segmentation, we propose a Point Attention Graph (PAG) module that improves the fusion of local features in both quality and speed. The PAG module uses point attention to adaptively compute the interactions between all nodes of a local graph. It efficiently defines the relations among local points, improving feature extraction in both accuracy and runtime, especially compared with related models such as PointWeb. Experiments show that our method can be applied effectively to semantic segmentation datasets.
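To make the idea of attention over a local graph concrete, the following is a minimal NumPy sketch, not the authors' PAG implementation. It assumes the points have already been grouped into local neighborhoods of K points each, and it follows the generic position attention pattern from 2D segmentation (query/key/value projections, a softmax over pairwise scores, and a residual connection). The function name `local_point_attention`, the projection matrices `w_q`, `w_k`, `w_v`, and the scale `gamma` are hypothetical names introduced for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def local_point_attention(feats, w_q, w_k, w_v, gamma=1.0):
    """Attention among the K points of each local group (illustrative only).

    feats: (G, K, C) features for G local groups of K points each.
    w_q, w_k: (C, D) query/key projections; w_v: (C, C) value projection.
    Returns the refined features (G, K, C) and the attention maps (G, K, K).
    """
    q = feats @ w_q                      # (G, K, D)
    k = feats @ w_k                      # (G, K, D)
    v = feats @ w_v                      # (G, K, C)
    scores = q @ k.transpose(0, 2, 1)    # (G, K, K) pairwise interactions
    attn = softmax(scores, axis=-1)      # each row sums to 1
    out = gamma * (attn @ v) + feats     # residual connection
    return out, attn


# Toy usage with random data; all sizes are arbitrary assumptions.
rng = np.random.default_rng(0)
G, K, C, D = 4, 16, 32, 8
feats = rng.standard_normal((G, K, C))
w_q = rng.standard_normal((C, D)) * 0.1
w_k = rng.standard_normal((C, D)) * 0.1
w_v = rng.standard_normal((C, C)) * 0.1

out, attn = local_point_attention(feats, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Each point's output is a weighted sum over every point in its neighborhood, so the cost of the attention step grows with K² per group rather than with the size of the whole cloud. With `gamma` set to 0 the function returns the input features unchanged, which is a convenient sanity check.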
Table 2: Runtime comparison (4×4096 points of 5 cycles on S3DIS