Self-Supervised 3D Mesh Object Retrieval

Authors:
Kajal Sanklecha, Center of Visual Information Technology, International Institute of Information Technology, Hyderabad, IN (ORCID: 0009-0004-5989-4245)
Prayushi Mathur, Center of Visual Information Technology, International Institute of Information Technology, Hyderabad, IN (ORCID: 0000-0003-2102-5146)
P. J. Narayanan, Center of Visual Information Technology, IIIT-Hyderabad, IN (ORCID: 0000-0002-7164-4917)

ICVGIP '23: Proceedings of the Fourteenth Indian Conference on Computer Vision, Graphics and Image Processing, December 2023, Article No.: 26, Pages 1–10
https://doi.org/10.1145/3627631.3627657

Published: 31 January 2024

ABSTRACT

Digital representations of 3D objects are increasingly used in engineering, entertainment, education, and other fields. Searching and retrieving digital 3D models from a collection has not received sufficient attention, unlike the retrieval of documents and images. Supervised methods are impractical for this problem because a large collection of labelled 3D objects is difficult to create. This paper presents a self-supervised method to learn efficient embeddings of 3D mesh objects for ranked retrieval of similar objects. We propose a simple representation of mesh objects and an encoder-decoder architecture to learn the embedding. Extensive experiments show that our method is competitive with methods that need supervision while being more scalable to different object collections.
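
To make the idea of ranked retrieval over learned embeddings concrete, the following is a minimal sketch. It assumes each mesh has already been mapped to a fixed-length embedding vector by some encoder, and it ranks the collection by cosine similarity to a query embedding. The function names, the toy vectors, and the choice of cosine similarity are assumptions made for illustration; the abstract does not state the similarity measure or embedding size used in the paper.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float],
    collection: Dict[str, Sequence[float]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [
        (name, cosine_similarity(query, embedding))
        for name, embedding in collection.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


# Hypothetical 3-dimensional embeddings; real embeddings would come from
# a trained encoder and would typically be much longer.
collection = {
    "chair_a": [0.9, 0.1, 0.0],
    "chair_b": [0.8, 0.2, 0.1],
    "lamp": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
ranking = rank_by_similarity(query, collection, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy collection the two chair vectors point in nearly the same direction as the query, so they rank ahead of the lamp. For large collections, an approximate nearest-neighbour index would usually replace the exhaustive scan shown here.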

Supplemental Material

icvgup23-26-supplementary.zip (28.3 MB)

Supplementary material for the paper titled "Self-Supervised Mesh Object Retrieval". Paper ID - 26.

Index Terms

Self-Supervised 3D Mesh Object Retrieval
1. Computing methodologies
  1. Computer graphics
    1. Shape modeling
  2. Machine learning
    1. Machine learning approaches
      1. Learning latent representations

Published in
ICVGIP '23: Proceedings of the Fourteenth Indian Conference on Computer Vision, Graphics and Image Processing
December 2023
352 pages
ISBN: 9798400716256
DOI: 10.1145/3627631
Editors: Rahul Narain, Kaushik Mitra, Ian Reid
Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
3D Triangle Mesh
Embedding Space
Mesh Analysis
Object Retrieval
Ranked Retrieval
Self-Supervision
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 95 of 286 submissions, 33%