Abstract
View-based 3D model classification and retrieval are increasingly important in many fields, and the related applications urgently demand high classification accuracy and retrieval precision. However, these two topics are usually treated separately, and very few works analyze their relationship in depth. We argue that although classification and retrieval emphasize different characteristics of the embedding features, they are compatible rather than opposed to each other. Inspired by recent deep metric learning approaches in this field, we propose a novel loss, named center-push loss, for joint feature learning. The proposed loss drives a convolutional neural network to learn object features that are compact within each class and well separated between classes. It avoids the troublesome triplet sampling operation, which usually requires a delicately designed sampling strategy for efficient network optimization. The new loss is simple in structure and fast to train, and it achieves the best classification and retrieval performance compared with many state-of-the-art methods.
Acknowledgements
Funding was provided by National Natural Science Foundation of China (Grant Nos. U1711265 and 61772158) and Self-Planned Task of State Key Laboratory of Robotics and System (Grant No. SKLRS202014B).
Cite this article
Wang, D., Wang, B., Yao, H. et al. Center-push loss for joint view-based 3D model classification and retrieval feature learning. SIViP 17, 873–880 (2023). https://doi.org/10.1007/s11760-021-01923-4