How to Transfer a Semantic Segmentation Model from Autonomous Driving to Other Domains?

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 693)

Abstract

Semantic scene understanding is an important task for robots operating autonomously in real-world applications. Recent deep convolutional neural networks (CNNs) have proven to be an effective approach to semantic image segmentation, especially for tasks where plenty of labeled data is available. However, many applications need to learn new, specific classes for which little labeled training data exists. This paper addresses the problem of transferring the knowledge of existing CNN models, e.g., from autonomous driving applications, to different classes and domains, e.g., different robotic platforms. Our work explores the two common transfer learning approaches for the particular problem of semantic segmentation: (1) fine-tuning existing models with the new training data, following a standard pipeline; and (2) training a superpixel classifier using our proposed superpixel representation, which combines local and context information. We evaluate both approaches on three varied binary segmentation use cases from different domains. Our experiments demonstrate the advantages and limitations of each alternative, showing that the proposed superpixel-based strategies learn better models with limited amounts of labeled data.
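
To make the proposed alternative concrete, the sketch below illustrates approach (2): segment each training image into superpixels (SLIC is a common choice), describe every superpixel by a local descriptor concatenated with a context descriptor aggregated over its neighbors, and train a binary SVM on those descriptors. This is a minimal illustration under stated assumptions, not the authors' implementation: plain mean-color statistics stand in for the CNN-derived features the paper builds on, and all function names and parameter values here are hypothetical.

```python
# Minimal sketch of a binary superpixel classifier with local + context
# features. Mean-colour statistics are a stand-in for CNN-derived
# descriptors; names and parameters are illustrative assumptions.
import numpy as np
from skimage.segmentation import slic
from sklearn.svm import SVC

def superpixel_features(image, labels):
    """One row per superpixel: its mean colour (local part) concatenated
    with the mean colour of its adjacent superpixels (context part)."""
    n = labels.max() + 1
    local = np.array([image[labels == s].mean(axis=0) for s in range(n)])
    # Build adjacency from 4-connected pixel pairs that cross a border.
    adj = [set() for _ in range(n)]
    for a, b in ((labels[:, :-1], labels[:, 1:]),
                 (labels[:-1, :], labels[1:, :])):
        border = a != b
        for u, v in zip(a[border], b[border]):
            adj[u].add(v)
            adj[v].add(u)
    context = np.array([local[sorted(adj[s])].mean(axis=0) if adj[s]
                        else local[s] for s in range(n)])
    return np.hstack([local, context])

def train_superpixel_classifier(images, masks, n_segments=200):
    """Fit a binary SVM on superpixels; a superpixel is labelled positive
    when the majority of its pixels are positive in the ground-truth mask."""
    X, y = [], []
    for image, mask in zip(images, masks):
        labels = slic(image, n_segments=n_segments, compactness=10,
                      start_label=0)
        feats = superpixel_features(image, labels)
        for s in range(labels.max() + 1):
            X.append(feats[s])
            y.append(int(mask[labels == s].mean() > 0.5))
    return SVC(kernel="rbf").fit(np.array(X), np.array(y))
```

Approach (1), by contrast, would replace the output layer of a segmentation CNN pre-trained on a large driving dataset and continue training on the new labels; the abstract reports that the superpixel-based route learns better models when labeled data is limited.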



Acknowledgments

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research. This research has been partially funded by the European Research Council (ERC) under the EU Horizon 2020 program (CHAMELEON project, grant agreement No 682080), the Spanish Government (projects DPI2015-65962-R, DPI2015-69376-R) and Aragon regional government (Grupo DGA T04-FSE).

Author information

Correspondence to Ana B. Cambra.


Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Cambra, A.B., Muñoz, A., Murillo, A.C. (2018). How to Transfer a Semantic Segmentation Model from Autonomous Driving to Other Domains?. In: Ollero, A., Sanfeliu, A., Montano, L., Lau, N., Cardeira, C. (eds) ROBOT 2017: Third Iberian Robotics Conference. ROBOT 2017. Advances in Intelligent Systems and Computing, vol 693. Springer, Cham. https://doi.org/10.1007/978-3-319-70833-1_53

  • DOI: https://doi.org/10.1007/978-3-319-70833-1_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70832-4

  • Online ISBN: 978-3-319-70833-1

  • eBook Packages: Engineering (R0)
