Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes

Hinterstoisser, Stefan; Lepetit, Vincent; Ilic, Slobodan; Holzer, Stefan; Bradski, Gary; Konolige, Kurt; Navab, Nassir

doi:10.1007/978-3-642-37331-2_42

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7724)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

We propose a framework for automatic modeling, detection, and tracking of 3D objects with a Kinect. The detection part is mainly based on the recent template-based LINEMOD approach [1] for object detection. We show how to build the templates automatically from 3D models, and how to estimate the 6 degrees-of-freedom pose accurately and in real time. The pose estimation and the color information allow us to check the detection hypotheses and improve the correct detection rate by 13% with respect to the original LINEMOD. These improvements make our framework suitable for object manipulation in robotics applications. Moreover, we propose a new dataset made of 15 registered video sequences of 15 different objects, each with more than 1100 frames, for the evaluation of future competing methods.

References

[1] Hinterstoisser, S., Cagniart, C., Holzer, S., Ilic, S., Konolige, K., Navab, N., Lepetit, V.: Multimodal Templates for Real-Time Detection of Texture-Less Objects in Heavily Cluttered Scenes. In: ICCV (2011)
[2] Newcombe, R.A., Izadi, S., Hilliges, O., Molyneaux, D., Kim, D., Davison, A.J., Kohli, P., Shotton, J., Hodges, S., Fitzgibbon, A.: KinectFusion: Real-Time Dense Surface Mapping and Tracking. In: ISMAR (2011)
[3] Pan, Q., Reitmayr, G., Drummond, T.: ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition. In: BMVC (2009)
[4] Weise, T., Wismer, T., Leibe, B., Gool, L.V.: In-hand Scanning with Online Loop Closure. In: International Workshop on 3-D Digital Imaging and Modeling (2009)
[5] Newcombe, R.A., Lovegrove, S.J., Davison, A.J.: DTAM: Dense Tracking and Mapping in Real-Time. In: ICCV (2011)
[6] Viola, P., Jones, M.: Fast Multi-View Face Detection. In: CVPR (2003)
[7] Stark, M., Goesele, M., Schiele, B.: Back to the Future: Learning Shape Models from 3D CAD Data. In: BMVC (2010)
[8] Liebelt, J., Schmid, C.: Multi-View Object Class Detection With a 3D Geometric Model. In: CVPR (2010)
[9] Ferrari, V., Jurie, F., Schmid, C.: From Images to Shape Models for Object Detection. In: IJCV (2009)
[10] Payet, N., Todorovic, S.: From Contours to 3D Object Detection and Pose Estimation. In: ICCV, pp. 983–990 (2011)
[11] Gavrila, D., Philomin, V.: Real-Time Object Detection for “smart” Vehicles. In: ICCV (1999)
[12] Huttenlocher, D., Klanderman, G., Rucklidge, W.: Comparing Images Using the Hausdorff Distance. TPAMI (1993)
[13] Steger, C.: Similarity Measures for Occlusion, Clutter, and Illumination Invariant Object Recognition. In: Radig, B., Florczyk, S. (eds.) DAGM 2001. LNCS, vol. 2191, pp. 148–154. Springer, Heidelberg (2001)
[14] Hinterstoisser, S., Lepetit, V., Ilic, S., Fua, P., Navab, N.: Dominant Orientation Templates for Real-Time Detection of Texture-Less Objects. In: CVPR (2010)
[15] Mian, A.S., Bennamoun, M., Owens, R.A.: Automatic Correspondence for 3D Modeling: an Extensive Review. International Journal of Shape Modeling (2005)
[16] Zhang, Z.: Iterative Point Matching for Registration of Free-Form Curves. In: IJCV (1994)
[17] Johnson, A.E., Hebert, M.: Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes. TPAMI (1999)
[18] Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model Globally, Match Locally: Efficient and Robust 3D Object Recognition. In: CVPR (2010)
[19] Mian, A.S., Bennamoun, M., Owens, R.: Three-Dimensional Model-Based Object Recognition and Segmentation in Cluttered Scenes. TPAMI (2006)
[20] Rusu, R.B., Blodow, N., Beetz, M.: Fast Point Feature Histograms (FPFH) for 3D Registration. In: International Conference on Robotics and Automation (2009)
[21] Tombari, F., Salti, S., Di Stefano, L.: Unique Signatures of Histograms for Local Surface Description. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part III. LNCS, vol. 6313, pp. 356–369. Springer, Heidelberg (2010)
[22] Sun, M., Bradski, G., Xu, B.-X., Savarese, S.: Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 658–671. Springer, Heidelberg (2010)
[23] Lai, K., Bo, L., Ren, X., Fox, D.: Sparse Distance Learning for Object Recognition Combining RGB and Depth Information. In: ICRA, pp. 4007–4013 (2011)
[24] Grabner, M., Grabner, H., Bischof, H.: Learning Features for Tracking. In: CVPR (2007)
[25] Ozuysal, M., Calonder, M., Lepetit, V., Fua, P.: Fast Keypoint Online Learning and Recognition. TPAMI (2010)
[26] Kalal, Z., Matas, J., Mikolajczyk, K.: P-N Learning: Bootstrapping Binary Classifiers by Structural Constraints. In: CVPR (2010)
[27] Hinterstoisser, S., Benhimane, S., Lepetit, V., Fua, P., Navab, N.: Simultaneous Recognition and Homography Extraction of Local Patches With a Simple Linear Classifier. In: BMVC (2008)
[28] Fitzgibbon, A.: Robust Registration of 2D and 3D Point Sets. In: BMVC (2001)
[29] Hinterstoisser, S., Ilic, S., Sturm, P., Navab, N., Fua, P., Lepetit, V.: Gradient Response Maps for Real-Time Detection of Texture-Less Objects. TPAMI (2012)
[30] Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. In: CVPR (2005)

Author information

Authors and Affiliations

CAMP, Technische Universität München (TUM), Germany
Stefan Hinterstoisser, Slobodan Ilic, Stefan Holzer & Nassir Navab
Industrial Perception, Palo Alto, CA, USA
Gary Bradski & Kurt Konolige
CV-Lab, École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
Vincent Lepetit

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, Seoul National University, 1 Gwanak-ro, 151-744, Gwanak-gu, Seoul, Korea
Kyoung Mu Lee
Microsoft Research Asia, No. 5, Danling st., Haidian district, 100080, Beijing, P.R. China
Yasuyuki Matsushita
School of Interactive Computing, Georgia Institute of Technology, 801 Atlantic Drive, CCB 315, 30332, Atlanta, GA, USA
James M. Rehg
Institute of Automation, National Laboratory of Pattern Recognition, Chinese Academy of Sciences, Zhong Quan Cun East Road 95, Haidian District, 100 190, Beijing, P.R. China
Zhanyi Hu

About this paper

Cite this paper

Hinterstoisser, S. et al. (2013). Model Based Training, Detection and Pose Estimation of Texture-Less 3D Objects in Heavily Cluttered Scenes. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_42

DOI: https://doi.org/10.1007/978-3-642-37331-2_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37330-5
Online ISBN: 978-3-642-37331-2
eBook Packages: Computer Science, Computer Science (R0)
