research-article

Shape2Vec: semantic-based descriptors for 3D shapes, sketches and images

Authors:

Flora Ponjou Tasse,

Neil Dodgson

ACM Transactions on Graphics (TOG), Volume 35, Issue 6

Article No.: 208, Pages 1 - 12

https://doi.org/10.1145/2980179.2980253

Published: 05 December 2016

Abstract

Convolutional neural networks have been successfully used to compute shape descriptors, or to jointly embed shapes and sketches in a common vector space. We propose a novel approach that leverages both labeled 3D shapes and the semantic information contained in the labels to generate semantically meaningful shape descriptors. A neural network is trained to generate shape descriptors that lie close to a vector representation of the shape class, given a vector space of words. This method is easily extendable to range scans, hand-drawn sketches and images. This makes cross-modal retrieval possible without the need to design different methods for each query type. We show that sketch-based shape retrieval using semantic-based descriptors outperforms the state of the art by large margins, and that mesh-based retrieval generates results of higher relevance to the query than current deep shape descriptors.
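The core idea in the abstract, training a model so that each descriptor lands near the word vector of its class label, can be illustrated with a small sketch. Everything below is an assumption made for illustration: the 3-D "word vectors", the 4-D input features, the linear map `W`, the squared-error loss, and the helper names (`embed`, `train`, `retrieve`) are hypothetical stand-ins, not the architecture, loss, or data used in the paper.

```python
# Toy illustration only: vectors, names, and the linear model are assumptions,
# not the network or word space described in the paper.
import math

# Hypothetical 3-D "word vectors" for class labels (chair and table are close).
WORD_VECS = {
    "chair":    [0.9, 0.1, 0.0],
    "table":    [0.8, 0.3, 0.0],
    "airplane": [0.0, 0.1, 0.9],
}

# Hypothetical 4-D features standing in for shapes (or sketches, images).
TRAIN = [
    ([1.0, 0.0, 0.0, 0.0], "chair"),
    ([0.9, 0.1, 0.0, 0.0], "chair"),
    ([0.0, 1.0, 0.0, 0.0], "table"),
    ([0.1, 0.9, 0.0, 0.0], "table"),
    ([0.0, 0.0, 1.0, 0.0], "airplane"),
    ([0.0, 0.0, 0.9, 0.1], "airplane"),
]

def embed(W, x):
    """Map a feature vector x to a descriptor in the word-vector space."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean_sq_loss(W, data):
    """Mean squared distance between descriptors and their label's word vector."""
    total = 0.0
    for x, label in data:
        d = embed(W, x)
        total += sum((di - ti) ** 2 for di, ti in zip(d, WORD_VECS[label]))
    return total / len(data)

def train(data, dim_in=4, dim_out=3, lr=0.5, epochs=200):
    """Full-batch gradient descent on the mean squared loss, starting from zeros."""
    W = [[0.0] * dim_in for _ in range(dim_out)]
    for _ in range(epochs):
        grad = [[0.0] * dim_in for _ in range(dim_out)]
        for x, label in data:
            d = embed(W, x)
            t = WORD_VECS[label]
            for r in range(dim_out):
                err = 2.0 * (d[r] - t[r]) / len(data)
                for c in range(dim_in):
                    grad[r][c] += err * x[c]
        for r in range(dim_out):
            for c in range(dim_in):
                W[r][c] -= lr * grad[r][c]
    return W

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def nearest_label(descriptor):
    """Class whose word vector is most similar to the descriptor."""
    return max(WORD_VECS, key=lambda name: cosine(descriptor, WORD_VECS[name]))

def retrieve(W, query_features, database):
    """Rank database items by descriptor similarity to the query descriptor."""
    q = embed(W, query_features)
    return sorted(database, key=lambda item: -cosine(q, embed(W, item[0])))

W = train(TRAIN)
ranked = retrieve(W, [0.0, 0.0, 0.8, 0.2], TRAIN)
print(ranked[0][1])
```

Because every descriptor lives in the same word-vector space, a query from another modality would only need its own embedding function into that space; ranking by similarity in `retrieve` would stay the same. That is one plausible reading of why the abstract describes cross-modal retrieval as not requiring a separate method per query type.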

Supplementary Material

ZIP file (a208-tasse.zip, 93.65 MB): supplemental file.


Index Terms

Shape2Vec: semantic-based descriptors for 3D shapes, sketches and images
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
        Shape representations



Published In


ACM Transactions on Graphics Volume 35, Issue 6

November 2016

1045 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2980179


Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
