Measuring the Sim2Real Gap in 3D Object Classification for Different 3D Data Representation

  • Conference paper
  • Computer Vision Systems (ICVS 2021)

Abstract

Perceiving the environment geometry is necessary for a robot to perform safe motions and actions. To decide upon meaningful actions, however, semantic understanding is also required. At the object level, this semantic classification task can be performed directly on the extracted object 3D data. While continuously improving, the performance of methods designed for this task still decreases on data captured by a robot because of input data differences [18], referred to as the Sim2Real gap. In this paper, we aim to better evaluate that gap for different 3D data representations and to understand the impact of a variety of design choices through a set of specific experiments, performed both on the ModelNet dataset [20], to which a variety of alterations are applied, and on the ScanObjectNN dataset [18]. Results indicate that occlusions play an essential part in the gap and that their impact is mitigated by the use of hierarchical representations learned from the surface of the object itself.

The research leading to these results has received funding from the Austrian Science Fund (FWF) under grant agreements No. I3968-N30 HEAP and No. I3969-N30 InDex.
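The exact alterations applied to ModelNet are detailed in the paper itself, not on this page. As an illustration only, the minimal sketch below shows one plausible perturbation of the kind the abstract describes: simulating self-occlusion of a point cloud by keeping, for each angular bin seen from a synthetic camera, only the nearest point. The function and parameter names (`simulate_occlusion`, `camera`, `grid`) are assumptions for this sketch, not taken from the paper.

```python
import numpy as np

def simulate_occlusion(points, camera=np.array([0.0, 0.0, 2.0]), grid=64):
    """Crude self-occlusion (illustrative, not the paper's method):
    as seen from `camera`, keep only the nearest point falling into each
    cell of an azimuth/elevation grid; farther points in the same cell
    are treated as hidden."""
    v = points - camera                               # rays camera -> points
    r = np.linalg.norm(v, axis=1)                     # depth along each ray
    az = np.arctan2(v[:, 1], v[:, 0])                 # azimuth in [-pi, pi]
    el = np.arcsin(np.clip(v[:, 2] / r, -1.0, 1.0))   # elevation in [-pi/2, pi/2]
    # Discretize ray directions into grid x grid angular bins.
    ai = np.clip(((az + np.pi) / (2 * np.pi) * grid).astype(int), 0, grid - 1)
    ei = np.clip(((el + np.pi / 2) / np.pi * grid).astype(int), 0, grid - 1)
    cell = ai * grid + ei
    order = np.argsort(r)                             # nearest points first
    _, first = np.unique(cell[order], return_index=True)
    return points[order[first]]                       # one visible point per cell

# Example: a unit-sphere cloud loses roughly its far half.
pts = np.random.randn(4096, 3)
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
print(len(simulate_occlusion(pts)), "of", len(pts), "points kept")
```

A coarser grid discards more points (stronger occlusion), while a finer grid approaches the full cloud; since the paper finds occlusions to be an essential part of the Sim2Real gap, sweeping such a parameter is one natural way to probe a classifier's robustness.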


References

  1. Bernardini, F., Mittleman, J., Rushmeier, H., Silva, C., Taubin, G.: The ball-pivoting algorithm for surface reconstruction. IEEE Trans. Visual. Comput. Graph. 5(4), 349–359 (1999). https://doi.org/10.1109/2945.817351

  2. Choy, C., Gwak, J., Savarese, S.: 4D spatio-temporal ConvNets: Minkowski convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3075–3084 (2019)

  3. Feng, Y., Zhang, Z., Zhao, X., Ji, R., Gao, Y.: GVCNN: group-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 264–272 (2018)

  4. Han, Z., et al.: 3D2SeqViews: aggregating sequential views for 3D global feature learning by CNN with hierarchical attention aggregation. IEEE Trans. Image Process. 28(8), 3986–3999 (2019)

  5. He, Y., Lee, C.H.: An improved ICP registration algorithm by combining PointNet++ and ICP algorithm. In: 2020 6th International Conference on Control, Automation and Robotics (ICCAR), pp. 741–745. IEEE (2020)

  6. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012)

  7. Ma, C., An, W., Lei, Y., Guo, Y.: BV-CNNs: binary volumetric convolutional networks for 3D object recognition. In: BMVC, vol. 1, p. 4 (2017)

  8. Maturana, D., Scherer, S.: VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 922–928. IEEE (2015)

  9. Newcombe, R.A., et al.: KinectFusion: real-time dense surface mapping and tracking. In: 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pp. 127–136 (2011). https://doi.org/10.1109/ISMAR.2011.6092378

  10. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660 (2017)

  11. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 5105–5114 (2017)

  12. Riegler, G., Osman Ulusoy, A., Geiger, A.: OctNet: learning deep 3D representations at high resolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3577–3586 (2017)

  13. Sfikas, K., Pratikakis, I., Theoharis, T.: Ensemble of panorama-based convolutional neural networks for 3D model classification and retrieval. Comput. Graph. 71, 208–218 (2018)

  14. Shao, L., et al.: UniGrasp: learning a unified model to grasp with multifingered robotic hands. IEEE Robot. Autom. Lett. 5(2), 2286–2293 (2020). https://doi.org/10.1109/LRA.2020.2969946

  15. Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 945–953 (2015)

  16. Su, J.C., Gadelha, M., Wang, R., Maji, S.: A deeper look at 3D shape classifiers. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops (2018)

  17. Thomas, H., et al.: KPConv: flexible and deformable convolution for point clouds. In: Proceedings of the IEEE International Conference on Computer Vision (2019)

  18. Uy, M.A., Pham, Q.H., Hua, B.S., Nguyen, D.T., Yeung, S.K.: Revisiting point cloud classification: a new benchmark dataset and classification model on real-world data. In: International Conference on Computer Vision (ICCV) (2019)

  19. Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 38(5), 1–12 (2019)

  20. Wu, Z., et al.: 3D ShapeNets: a deep representation for volumetric shapes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1912–1920 (2015)

  21. Xu, Y., Fan, T., Xu, M., Zeng, L., Qiao, Y.: SpiderCNN: deep learning on point sets with parameterized convolutional filters. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 87–102 (2018)


Author information

Correspondence to Jean-Baptiste Weibel.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Weibel, J.B., Rohrböck, R., Vincze, M. (2021). Measuring the Sim2Real Gap in 3D Object Classification for Different 3D Data Representation. In: Vincze, M., Patten, T., Christensen, H.I., Nalpantidis, L., Liu, M. (eds) Computer Vision Systems. ICVS 2021. Lecture Notes in Computer Science, vol. 12899. Springer, Cham. https://doi.org/10.1007/978-3-030-87156-7_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-87156-7_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87155-0

  • Online ISBN: 978-3-030-87156-7

  • eBook Packages: Computer Science, Computer Science (R0)
