
Improving object orientation estimates by considering multiple viewpoints

Orientation histograms of symmetries and measurement models for view selection


Abstract

This article describes a probabilistic approach for improving the accuracy of general object pose estimation algorithms. We propose a histogram filter variant that uses the exploration capabilities of robots and supports active perception through a next-best-view proposal algorithm. For the histogram-based fusion method, we focus on the orientation of the 6 degrees of freedom (DoF) pose, since the position can be handled with common filtering techniques. The orientations of the object detected by a pose estimator are used to update the hypothesis about its actual orientation. We discuss the design of experiments for estimating the error model of a detection method, and describe a suitable representation of the orientation histograms. This allows us to incorporate priors about likely object poses or symmetries, and to use information gain measures for view selection. The method is validated and compared to alternatives based on the outputs of different 6 DoF pose estimators, using real-world depth images acquired with different sensors as well as a large synthetic dataset.
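
The fusion scheme summarized above lends itself to a compact illustration. The following is a minimal sketch, not the authors' implementation: it maintains a histogram over a coarse (and deliberately simplistic) Euler-angle discretization of the orientation space, updates it with a hypothetical Gaussian-like measurement model, and tracks the entropy of the posterior, which is the quantity an information-gain-based view selection would aim to reduce. All function names and parameters are illustrative.

```python
# Minimal sketch of a histogram filter over a coarse orientation grid.
# The discretization, measurement model and parameters are illustrative only.
import numpy as np
from scipy.spatial.transform import Rotation as R

def make_orientation_grid(n=6):
    """Coarse Euler-angle lattice standing in for a proper SO(3) discretization."""
    angles = np.linspace(-np.pi, np.pi, n, endpoint=False)
    return [R.from_euler('zyx', [a, b, c])
            for a in angles for b in angles for c in angles]

def geodesic_distance(r1, r2):
    """Angle of the relative rotation between two orientations (in radians)."""
    return (r1.inv() * r2).magnitude()

def likelihood(detected, hypothesis, kappa=8.0):
    """Hypothetical error model: detections concentrate around the true
    orientation; kappa controls how peaked the model is."""
    return np.exp(-kappa * geodesic_distance(detected, hypothesis) ** 2)

def update(histogram, grid, detected_orientation):
    """Bayes update of the orientation histogram with one detection."""
    weights = np.array([likelihood(detected_orientation, g) for g in grid])
    posterior = histogram * weights
    return posterior / posterior.sum()

def entropy(histogram):
    """Shannon entropy of the histogram; lower means a more peaked belief."""
    p = histogram[histogram > 0]
    return -np.sum(p * np.log(p))

# Usage: start from a uniform prior and fuse two noisy detections.
grid = make_orientation_grid()
hist = np.full(len(grid), 1.0 / len(grid))            # uniform prior
true_orientation = R.from_euler('zyx', [0.4, -0.2, 0.1])
for _ in range(2):
    noisy = true_orientation * R.from_rotvec(0.1 * np.random.randn(3))
    hist = update(hist, grid, noisy)
    print('posterior entropy:', entropy(hist))
```

In this toy setting, a next-best-view proposal would simulate the expected detections from each candidate viewpoint using a view-dependent error model and pick the view with the largest expected entropy reduction.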




Notes

  1. http://docs.pointclouds.org/1.7.0/group__visualization.html.

  2. In our experiments the prior was always uniformly distributed, but it can easily be replaced by any distribution that the discretization can represent (which is much more flexible than a parametric model); see the sketch after these notes.

  3. We do not apply additional uncertainty to the estimates after a viewpoint change, as we used very precise sensor movements to obtain the ground truth.

  4. https://github.com/PointCloudLibrary/pcl/tree/master/simulation.

  5. Please note that the box plot drawing library detects outliers based on interquartile range distance, marks them as dots, and excludes them from percentile calculations.
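
As mentioned in note 2, the uniform prior can be replaced by any distribution that the discretization can represent. The following is a small, purely hypothetical sketch of such a prior, reusing the orientation grid from the example above; the "upright pose" assumption is an illustration, not a choice made in the paper.

```python
# Illustrative non-uniform prior over the discretized orientation grid,
# favouring roughly "upright" object poses; the choice of prior is hypothetical.
import numpy as np

def upright_prior(grid, kappa=4.0):
    """Weight each orientation bin by how well it keeps the object's z-axis
    aligned with the world z-axis (e.g. a table-top assumption)."""
    up = np.array([0.0, 0.0, 1.0])
    weights = np.array([np.exp(kappa * float(g.apply(up) @ up)) for g in grid])
    return weights / weights.sum()

# Any distribution representable on the grid, e.g. one encoding a known
# symmetry, can be plugged in the same way in place of the uniform prior.
```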


Acknowledgements

The authors would like to thank Anas Al-Nuaimi for helpful discussions, and Laura Beckmann for her help with the real-world ground truth used in the evaluation.

Author information

Corresponding author

Correspondence to Zoltán Csaba Márton.

Additional information

This is one of several papers published in Autonomous Robots comprising the Special Issue on Active Perception.

This work has partly been supported by the EC, under contract number H2020-ICT-645403-ROBDREAM.


About this article


Cite this article

Márton, Z.C., Türker, S., Rink, C. et al. Improving object orientation estimates by considering multiple viewpoints. Auton Robot 42, 423–442 (2018). https://doi.org/10.1007/s10514-017-9633-1

