Abstract
Active vision is the ability of intelligent agents to dynamically gather more information about their surroundings through physical motion of the camera. In object recognition, active vision improves performance by incorporating classification decisions from new viewpoints whenever the current recognition result carries some degree of uncertainty. A natural question in an autonomous active vision system, however, is how to determine the new viewpoint, i.e., to what pose should the camera be moved? This is the traditional next best view question in active perception systems. Current approaches to the next best view problem either require the construction of occupancy grids or depend on training datasets of 3D objects or multiple captures of the same object in specified poses. Occupancy grid methods usually depend on multiple camera movements to perform well, which makes them better suited to 3D reconstruction applications than to object recognition. In this paper, a next best view method for active object recognition based on object appearance and surface direction is proposed that decides on the next camera pose without requiring any specifically structured training datasets of 3D objects. It is also designed for single-shot determination of the next viewpoint and can select next best views without substantial knowledge of the 3D voxels in the environment around the camera. The experimental results demonstrate the effectiveness of the proposed method, showing large improvements in accuracy and F1 score.
Notes
Dataset available at https://github.com/pouryahoseini/Next-Best-View-Dataset.
Funding
This work has been supported in part by the Office of Naval Research award N00014-16-1-2312 and US Army Research Laboratory (ARO) award W911NF-20-2-0084.
Author information
Contributions
Conceptualization: Pourya Hoseini, Mircea Nicolescu, Monica Nicolescu; Methodology: Pourya Hoseini, Mircea Nicolescu, Shuvo Kumar Paul; Formal Analysis and Investigation: Pourya Hoseini, Mircea Nicolescu, Monica Nicolescu; Writing - original draft preparation: Pourya Hoseini; Writing - review and editing: Pourya Hoseini, Shuvo Kumar Paul, Mircea Nicolescu; Funding acquisition: Monica Nicolescu, Mircea Nicolescu; Resources: Monica Nicolescu, Mircea Nicolescu, Pourya Hoseini, Shuvo Kumar Paul; Supervision: Mircea Nicolescu, Monica Nicolescu.
Additional information
Availability of data and material
Yes. Dataset available at https://github.com/pouryahoseini/Next-Best-View-Dataset.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Hoseini, P., Paul, S.K., Nicolescu, M. et al. A one-shot next best view system for active object recognition. Appl Intell 52, 5290–5309 (2022). https://doi.org/10.1007/s10489-021-02657-z