Semantically guided projection for zero-shot 3D model classification and retrieval

Su, Yuting; Li, Jiayu; Li, Wenhui; Gao, Zan; Chen, Haipeng; Li, Xuanya; Liu, An-An

doi:10.1007/s00530-022-00970-2

Regular Paper
Published: 16 July 2022

Volume 28, pages 2437–2451 (2022)

Multimedia Systems

Yuting Su¹, Jiayu Li¹ (ORCID: orcid.org/0000-0002-3458-2884), Wenhui Li¹, Zan Gao³, Haipeng Chen⁴, Xuanya Li⁵ & An-An Liu¹,²

Abstract

Most existing methods for 3D model classification and retrieval rely on fully supervised training, which requires collecting and labeling 3D models across a wide range of categories, a process that is prohibitively expensive and time-consuming. How to make full use of existing known data to represent unknown data is therefore a crucial question. Inspired by zero-shot learning in the 2D image domain, we propose a semantically guided projection method that classifies and retrieves unseen 3D models by exploring the semantic relationship between seen and unseen 3D models. First, we exploit the multi-view information of 3D models to construct semantic attributes that serve as prior knowledge for representing 3D models. Then, we learn bidirectional projections, from visual features to semantics and from semantics to visual features, which can eliminate the gap between the seen and unseen domains. Extensive experiments on zero-shot 3D model classification and retrieval on two popular datasets, ModelNet40 and ShapeNetCore55, demonstrate the effectiveness and superiority of the proposed method.
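
The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea of bidirectional projection, not the authors' formulation. It assumes a single tied linear map W (W for visual-to-semantic, its transpose for semantic-to-visual), a squared-error loss minimized by plain gradient descent, and toy 3-dimensional "visual" features with 2-dimensional attributes. All function names, hyperparameters, class labels, and data below are hypothetical.

```python
import math

def project_to_semantic(W, x):
    """Forward projection: visual feature x (length d) -> semantic space (length k)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def project_to_visual(W, s):
    """Backward projection with the transpose of W: semantic s (length k) -> visual space (length d)."""
    return [sum(W[i][j] * s[i] for i in range(len(W))) for j in range(len(W[0]))]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def fit_projection(X, S, lam=1.0, lr=0.05, steps=500):
    """Gradient descent on sum ||W^T s - x||^2 + lam * sum ||W x - s||^2 (assumed loss)."""
    k, d = len(S[0]), len(X[0])
    W = [[0.0] * d for _ in range(k)]
    for _ in range(steps):
        G = [[0.0] * d for _ in range(k)]
        for x, s in zip(X, S):
            fwd = [p - si for p, si in zip(project_to_semantic(W, x), s)]  # W x - s
            bwd = [p - xj for p, xj in zip(project_to_visual(W, s), x)]    # W^T s - x
            for i in range(k):
                for j in range(d):
                    G[i][j] += 2.0 * (lam * fwd[i] * x[j] + s[i] * bwd[j])
        for i in range(k):
            for j in range(d):
                W[i][j] -= lr * G[i][j]
    return W

def classify(W, x, prototypes):
    """Project x into semantic space and pick the most similar class prototype."""
    s_hat = project_to_semantic(W, x)
    return max(prototypes, key=lambda c: cosine(s_hat, prototypes[c]))

def classify_reverse(W, x, prototypes):
    """Project each class prototype into visual space and pick the one most similar to x."""
    return max(prototypes, key=lambda c: cosine(x, project_to_visual(W, prototypes[c])))

# Toy seen-class training data: features paired with the attribute vector of their class.
seen_X = [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.1], [0.1, 0.9, 0.0]]
seen_S = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

# Unseen classes are described only by attribute vectors; no training features exist for them.
unseen = {"C": [1.0, 1.0], "D": [1.0, -1.0]}

W = fit_projection(seen_X, seen_S)
print(classify(W, [1.0, 0.9, 0.0], unseen))
print(classify_reverse(W, [1.0, -0.9, 0.0], unseen))
```

In this sketch, classification of an unseen model works in either direction: compare in semantic space after the forward projection, or compare in visual space after projecting the class attributes backward. Retrieval could reuse the same similarity scores to rank a gallery instead of taking the single best match.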

Acknowledgements

This work was supported in part by the National Key Research and Development Program of China (2020YFB1709201), the National Natural Science Foundation of China (U21B2024, 61802277), and the Baidu Program.

Author information

Authors and Affiliations

The School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
Yuting Su, Jiayu Li, Wenhui Li & An-An Liu
The Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, 230088, China
An-An Liu
Qilu University of Technology (Shandong Academy of Sciences), Shandong Artificial Intelligence Institute, Jinan, China
Zan Gao
The College of Computer Science and Technology, Jilin University, Jilin, China
Haipeng Chen
Baidu Inc., Beijing, China
Xuanya Li

Corresponding authors

Correspondence to Wenhui Li or An-An Liu.

Additional information

Communicated by B.-K. Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Su, Y., Li, J., Li, W. et al. Semantically guided projection for zero-shot 3D model classification and retrieval. Multimedia Systems 28, 2437–2451 (2022). https://doi.org/10.1007/s00530-022-00970-2

Received: 24 September 2021
Accepted: 13 June 2022
Published: 16 July 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s00530-022-00970-2
