DOI: 10.1145/2466715.2466723
research-article

Label transfer exploiting three-dimensional structure for semantic segmentation

Published: 06 June 2013

Abstract

This paper addresses the problem of computing a semantic segmentation of an image via label transfer from a set of already labeled images. In particular, it proposes a method that takes advantage of sparse 3D structure to infer the category of each superpixel in the novel image. The label assignment is computed by a Markov random field whose nodes are the superpixels of the image. The data term combines labeling proposals derived from the appearance of the superpixel and from the 3D structure, while the pairwise term incorporates spatial context, both in the image and in 3D space. Exploratory results indicate that 3D structure, albeit sparse, improves the process of label transfer.
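
To make the structure of such a model concrete, the following is a minimal sketch of a superpixel Markov random field, not the authors' implementation. The abstract does not state the exact form of the energy or the solver, so everything here is an assumption for illustration: the data term is taken as the negative log of a weighted mix of an appearance proposal and a 3D proposal (the hypothetical `weight_3d` parameter), the pairwise term is a plain Potts penalty between neighboring superpixels, and the energy is minimized with iterated conditional modes (ICM) as a simple stand-in.

```python
import math

def unary_cost(p_app, p_3d, weight_3d=0.5, eps=1e-9):
    """Data term for one superpixel: negative log of a weighted mix of the
    appearance proposal and the 3D proposal (the mixing rule is an assumption)."""
    return [-math.log((1.0 - weight_3d) * a + weight_3d * b + eps)
            for a, b in zip(p_app, p_3d)]

def energy(labels, unary, edges, smoothness):
    """Total energy: data cost per node plus a Potts penalty per disagreeing edge."""
    data = sum(unary[i][labels[i]] for i in range(len(labels)))
    pair = sum(smoothness for i, j in edges if labels[i] != labels[j])
    return data + pair

def icm(unary, edges, smoothness=1.0, max_sweeps=20):
    """Iterated conditional modes: greedy per-node updates until nothing changes."""
    n, num_labels = len(unary), len(unary[0])
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # Start from the label that minimizes the data term alone.
    labels = [min(range(num_labels), key=lambda l: unary[i][l]) for i in range(n)]
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            def local_cost(l):
                return unary[i][l] + sum(
                    smoothness for j in neighbors[i] if labels[j] != l)
            best = min(range(num_labels), key=local_cost)
            if local_cost(best) < local_cost(labels[i]):
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Toy example: three superpixels in a chain, two labels (made-up numbers).
edges = [(0, 1), (1, 2)]
p_app = [[0.9, 0.1], [0.45, 0.55], [0.8, 0.2]]  # appearance proposals
p_3d = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]     # proposals from sparse 3D points

appearance_only = [unary_cost(a, b, weight_3d=0.0) for a, b in zip(p_app, p_3d)]
with_3d = [unary_cost(a, b, weight_3d=0.5) for a, b in zip(p_app, p_3d)]

print(icm(appearance_only, edges, smoothness=0.0))  # [0, 1, 0]
print(icm(appearance_only, edges, smoothness=1.0))  # [0, 0, 0]
print(icm(with_3d, edges, smoothness=0.0))          # [0, 0, 0]
```

In this toy setting the middle superpixel is ambiguous from appearance alone; either the pairwise penalty or the 3D proposal is enough to bring it in line with its neighbors. ICM only finds a local minimum, so a real system would likely use a stronger inference method.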

Cited By

  • (2020) Large-Scale Volumetric Scene Reconstruction using LiDAR. 2020 IEEE International Conference on Robotics and Automation (ICRA), pages 6261-6267. DOI: 10.1109/ICRA40945.2020.9197388. Online publication date: May 2020.
  • (2020) Aligning Videos in Space and Time. Computer Vision – ECCV 2020, pages 262-278. DOI: 10.1007/978-3-030-58574-7_16. Online publication date: 13 Nov 2020.
  • (2015) Can humans fly? Action understanding with multiple classes of actors. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2264-2273. DOI: 10.1109/CVPR.2015.7298839. Online publication date: Jun 2015.

    Published In

    MIRAGE '13: Proceedings of the 6th International Conference on Computer Vision / Computer Graphics Collaboration Techniques and Applications
    June 2013
    137 pages
    ISBN: 9781450320238
    DOI: 10.1145/2466715
    Sponsors

    • Humboldt Univ.: Humboldt-Universität zu Berlin
    • FHHI: Fraunhofer Heinrich Hertz Institute

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Markov random fields
    2. image parsing
    3. image understanding
    4. labeling
    5. segmentation
    6. structure from motion

    Qualifiers

    • Research-article

    Conference

    MIRAGE '13
    Sponsor:
    • Humboldt Univ.
    • FHHI
