Abstract
Structural scene understanding is an interconnected process in which modules for object detection and supporting structure detection must cooperate to extract cross-correlated information, thereby using as much of the information in the scene data as possible. Such an interlinked framework provides a holistic approach to scene understanding while obtaining the best possible detection rates. Motivated by recent research in coherent geometrical contextual reasoning and object recognition, this paper proposes a unified framework for robust 3D supporting plane estimation based on a joint probabilistic model that combines results from object shape detection and 3D plane estimation. Maximizing the joint probability yields robust 3D surface estimates while reducing false perceptual grouping. We present results on both synthetic data and real data obtained from an indoor mobile robot to demonstrate the benefits of the unified detection framework.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, K., Richtsfeld, A., Varadarajan, K.M., Zillich, M., Vincze, M. (2011). Combining Plane Estimation with Shape Detection for Holistic Scene Understanding. In: Blanc-Talon, J., Kleihorst, R., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2011. Lecture Notes in Computer Science, vol 6915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23687-7_66
DOI: https://doi.org/10.1007/978-3-642-23687-7_66
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23686-0
Online ISBN: 978-3-642-23687-7
eBook Packages: Computer Science (R0)