Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition

Fröhlich, Björn; Rodner, Erik; Kemmler, Michael; Denzler, Joachim

doi:10.1007/s00138-012-0480-y

Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition

Original Paper
Published: 04 January 2013

Volume 24, pages 1043–1053, (2013)
Cite this article

Machine Vision and Applications Aims and scope Submit manuscript

Björn Fröhlich¹,
Erik Rodner^1,2,
Michael Kemmler¹ &
…
Joachim Denzler¹

642 Accesses
9 Citations
3 Altmetric
Explore all metrics

Abstract

This paper deals with the task of semantic segmentation, which aims to provide a complete description of an image by inferring a pixelwise labeling. While pixelwise classification is a suitable approach to achieve this goal, state-of-the-art kernel methods are generally not applicable since training and testing phase involve large amounts of data. We address this problem by presenting a method for large-scale inference with Gaussian processes. Standard limitations of Gaussian process classifiers in terms of speed and memory are overcome by pre-clustering the data using decision trees. This leads to a breakdown of the entire problem into several independent classification tasks whose complexity is controlled by the maximum number of training examples allowed in the tree leaves. We additionally propose a technique which allows for computing multi-class probabilities by incorporating uncertainties of the classifier estimates. The approach provides pixelwise semantics for a wide range of applications and different image types such as those from scene understanding, defect localization, and remote sensing. Our experiments are performed with a facade recognition application that shows the significant performance gain achieved by our method compared to previous approaches.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Microsoft COCO: Common Objects in Context

Deep learning models for digital image processing: a review

Article 07 January 2024

R. Archana & P. S. Eliahim Jeevaraj

A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets

Article 09 February 2021

Himanshu Mittal, Avinash Chandra Pandey, … Garv Modwel

References

Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
Article MATH Google Scholar
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Chapman and Hall, London (1984)
MATH Google Scholar
Broderick, T., Gramacy, R.B.: Treed gaussian process models for classification. In: Classification as a Tool for Research, Studies in Classification, Data Analysis and Knowledge Organization, pp. 101–108 (2010)
Candela, Q.J., Rasmussen, C.E.: A unifying view of sparse approximate gaussian process regression. J. Mach. Learn. Res. 6, 1939–1959 (2005)
MathSciNet MATH Google Scholar
Chang, F., Guo, C.Y., Lin, X.R., Lu, C.J.: Tree decomposition for large-scale SVM problems. J. Mach. Learn. Res. 11, 2935–2972 (2010)
MathSciNet MATH Google Scholar
Chen, C., Freedman, D., Lampert, C.: Enforcing topological constraints in random field image segmentation. In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’11) (2011)
Chen, T., Ren, J.: Bagging for gaussian process regression. Neurocomputing 72(7–9), 1605–1610 (2009)
Article Google Scholar
Comaniciu, D., Meer, P.: Mean shift: a robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5), 603–619 (2002)
Article Google Scholar
Csurka, G., Perronnin, F.: An efficient approach to semantic segmentation. IJCV 95(2), 198–212 (2011)
Article MathSciNet Google Scholar
Domke, J.: Crossover random fields. J. Mach. Learn. Res. (2009)
Dumont, M., Marée, R., Wehenkel, L., Geurts, P.: Fast multi-class image annotation with random subwindows and multiple output randomized trees. In: Proceedings of the 4th International Conference on Computer Vision, Theory and Applications (VISAPP), vol. 2, pp. 196–203 (2009)
Fröhlich, B., Rodner, E., Denzler, J.: A fast approach for pixelwise labeling of facade images. In: Proceedings of the International Conference on Pattern Recognition (ICPR’10), pp. 3029–3032 (2010)
Fröhlich, B., Rodner, E., Kemmler, M., Denzler, J.: Efficient gaussian process classification using random decision forests. Pattern Recogn. Image Anal. 21, 184–187 (2011)
Article Google Scholar
Gool, L.J.V., Zeng, G., den Borre, F.V., Müller, P.: Towards mass-produced building models. In: Photogrammetric Image Analysis, pp. 209–220 (2007)
Gould, S., Rodgers, J., Cohen, D., Elidan, G., Koller, D.: Multi-class segmentation with relative location prior. Int. J. Comput. Vis. 80(3), 300–316 (2008). doi:10.1007/s11263-008-0140-x
Google Scholar
Huang, Q.X., Han, M., Wu, B., Ioffe, S.: A hierarchical conditional random field model for labeling and segmenting images of street scenes. In: Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1953–1960. IEEE, New York (2011)
Iba, W., Langley, P.: Induction of one-level decision trees. In: Proceedings of the International Conference of Machine Learning (ICML’92) (1992)
Kapoor, A., Grauman, K., Urtasun, R., Darrell, T.: Gaussian processes for object categorization. Int. J. Comput. Vis. 88(2), 169–188 (2010)
Article Google Scholar
Kohli, P., Ladicky, L., Torr, P.: Robust higher order potentials for enforcing label consistency. In: IEEE Conference on Computer Vision and Pattern Recognition, 2008. CVPR 2008, pp. 1–8 (2008). doi:10.1109/CVPR.2008.4587417
Korč, F., Förstner, W.: etrims image database for interpreting images of man-made scenes. Technical report, Department of Photography, University of Bonn (2009). http://www.ipb.uni-bonn.de/projects/etrims_db/
Lawrence, N.D., Jordan, M.I.: Semi-supervised learning via gaussian processes. In: Advances in Neural Information Processing Systems, pp. 753–760 (2005)
Leistner, C., Saffari, A., Santner, J., Bischof, H.: Semi-supervised random forests. In: Proceedings of the 2009 International Conference on Computer Vision (ICCV’09), pp. 506–513 (2009)
Rabinovich, A., Vedaldi, A., Galleguillos, C., Wiewiora, E., Belongie, S.: Objects in context. In: Proceedings of the 2007 International Conference on Computer Vision (ICCV’07), pp. 1–8 (2007)
Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). MIT Press, Cambridge (2005)
Google Scholar
Ripperda, N., Brenner, C.: Evaluation of structure recognition using labelled facade images. In: Proceedings of the DAGM, pp. 532–541 (2009)
Rodner, E., Hegazy, D., Denzler, J.: Multiple kernel gaussian process classification for generic 3d object recognition from time-of-flight images. In: Proceedings of the International Conference on Image and Vision Computing (2010)
van de Sande, K., Gevers, T., Snoek, C.: Evaluating color descriptors for object and scene recognition. PAMI 32, 1582–1596 (2010)
Google Scholar
Schölkopf, B., Smola, A.: Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge (2001)
Google Scholar
Shen, Y., Ng, A., Seeger, M.: Fast gaussian process regression using kd-trees. In. Advances in Neural Information Processing Systems, pp. 1225–1232 (2006)
Shotton, J., Johnson, M., Cipolla, R.: Semantic texton forests for image categorization and segmentation. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’08), pp. 1–8 (2008)
Shotton, J., Winn, J.M., Rother, C., Criminisi, A.: Textonboost: joint appearance, shape and context modeling for multi-class object recognition and segmentation. In: Proceedings of the European Conference of Computer Vision (ECCV’06), pp. 1–15 (2006)
Simon, L., Teboul, O., Koutsourakis, P., Paragios, N.: Random exploration of the procedural space for single-view 3d modeling of buildings. Int. J. Comput. Vis. 93, 253–271 (2011)
Article MathSciNet MATH Google Scholar
Snelson, E., Ghahramani, Z.: Sparse gaussian processes using pseudo-inputs. In: Advances in Neural Information Processing Systems (2006)
Teboul, O.: Shape Grammar Parsing: Application to Image-Based Modeling. PhD thesis, Ecole Centrale de Paris (2011)
Teboul, O., Kokkinos, I., Koutsourakis, P., Simon, L., Paragios, N.: Shape grammar parsing via reinforcement learning. In: IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2313–2319 (2011)
Teboul, O., Simon, L., Koutsourakis, P., Paragios, N.: Segmentation of building facades using procedural shape priors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2010)
Tipping, M.E.: Sparse bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 1, 211–244 (2001)
MathSciNet MATH Google Scholar
Tresp, V.: A bayesian committee machine. Neural Comput. 12, 2719–2741 (2000)
Article Google Scholar
Tsang, I.W., Kocsor, A., Kwok, J.T.: Simpler core vector machines with enclosing balls. In: Proceedings of the 24th international conference on Machine learning, pp. 911–918 (2007)
Urtasun, R., Darrell, T.: Sparse probabilistic regression for activity-independent human pose inference. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR’08) (2008)
Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, Berlin (1995)
Book MATH Google Scholar
Williams, C.K., Seeger, M.: Using the nyström method to speed up kernel machines. In: Advances in Neural Information Processing Systems, pp. 682–688 (2001)
Xiao, J., Fang, T., Zhao, P., Lhuillier, M., Quan, L.: Image-based street-side city modeling. ACM Trans. Graph. 28(5) (2009)
Xiao, J., Hays, J., Ehinger, K., Oliva, A., Torralba, A.: Sun database: Large-scale scene recognition from abbey to zoo. In: 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3485–3492 (2010). doi:10.1109/CVPR.2010.5539970
Xiao, J., Quan, L.: Multiple view semantic segmentation for street view images. In: Proceedings of 12th IEEE International Conference on Computer Vision, pp. 686–693 (2009)
Yang, M.Y., Forstner, W.: A hierarchical conditional random field model for labeling and classifying images of man-made scenes. In: 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), pp. 196–203 (2011). doi:10.1109/ICCVW.2011.6130243
Yang, M.Y., Förstner, W.: Regionwise classification of building facade images. In: Photogrammetric Image Analysis. Lecture Notes in Computer Science vol. 6952, pp. 209–220. Springer, Berlin (2011)

Download references

Acknowledgments

This work was partially supported by the Graduate School on Image Processing and Image Interpretation funded by the state of Thuringia/Germany.

Author information

Authors and Affiliations

Friedrich Schiller University, Jena, Germany
Björn Fröhlich, Erik Rodner, Michael Kemmler & Joachim Denzler
ICSI, UC Berkeley, USA
Erik Rodner

Authors

Björn Fröhlich
View author publications
You can also search for this author in PubMed Google Scholar
Erik Rodner
View author publications
You can also search for this author in PubMed Google Scholar
Michael Kemmler
View author publications
You can also search for this author in PubMed Google Scholar
Joachim Denzler
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Björn Fröhlich.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Fröhlich, B., Rodner, E., Kemmler, M. et al. Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition. Machine Vision and Applications 24, 1043–1053 (2013). https://doi.org/10.1007/s00138-012-0480-y

Download citation

Received: 22 February 2012
Revised: 28 November 2012
Accepted: 17 December 2012
Published: 04 January 2013
Issue Date: July 2013
DOI: https://doi.org/10.1007/s00138-012-0480-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition

Abstract

Access this article

Similar content being viewed by others

Microsoft COCO: Common Objects in Context

Deep learning models for digital image processing: a review

A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Large-scale gaussian process multi-class classification for semantic segmentation and facade recognition

Abstract

Access this article

Similar content being viewed by others

Microsoft COCO: Common Objects in Context

Deep learning models for digital image processing: a review

A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation