A Novel Multi-task Architecture for Vanishing Point Assisted Road Segmentation and Guidance in Off-Road Environments

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14087)

Abstract

Although convolutional neural network (CNN) based and transformer-based models are widely applied to the road segmentation task to provide driving vehicles with valuable information, there is currently no reliable and safe solution designed specifically for harsh off-road environments. To address this challenge, we propose a multi-task network (VPrs-Net) that simultaneously learns two tasks: vanishing point (VP) detection and road segmentation. By exploiting the road cues provided by the VP, VPrs-Net identifies drivable areas in harsh off-road environments more accurately. Moreover, guidance from the VP can further enhance the safety of driving vehicles. We further propose a multi-attention architecture that learns task-specific features from the global features to address attentional imbalance in multi-task learning. We evaluate VPrs-Net on the public ORFD off-road dataset. Experimental results show that, compared with several state-of-the-art algorithms, our model achieves 96.91% accuracy on the segmentation task and a mean NormDist error of 0.03288 on the road VP detection task. These results demonstrate the model's potential in challenging off-road environments.
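The abstract reports a mean NormDist error without defining the metric. As a minimal sketch, assuming the definition commonly used in the road VP detection literature (Euclidean distance between the predicted and ground-truth VP, divided by the image diagonal), the metric could be computed as below. The function names `norm_dist` and `mean_norm_dist` and the pixel-coordinate convention are hypothetical and are not taken from the paper.

```python
import math


def norm_dist(pred, gt, width, height):
    """Assumed NormDist: pixel distance between the predicted and
    ground-truth vanishing points, divided by the image diagonal."""
    diagonal = math.hypot(width, height)
    return math.hypot(pred[0] - gt[0], pred[1] - gt[1]) / diagonal


def mean_norm_dist(preds, gts, width, height):
    """Average the assumed NormDist over a set of images of equal size."""
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and of equal length")
    total = sum(norm_dist(p, g, width, height) for p, g in zip(preds, gts))
    return total / len(preds)
```

Under this assumed definition, a mean value of 0.03288 would correspond to an average VP error of roughly 3.3% of the image diagonal.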

Acknowledgments

This research was funded by the Natural Science Foundation of Shandong Province for Key Project under Grant ZR2020KF006, the National Natural Science Foundation of China under Grant 62273164, the Development Program Project of Youth Innovation Team of Institutions of Higher Learning in Shandong Province, and a Project of Shandong Province Higher Educational Science and Technology Program under Grants J16LB06 and J17KA055.

Corresponding author

Correspondence to Shiyuan Han.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liu, Y., Fan, X., Han, S., Yu, W. (2023). A Novel Multi-task Architecture for Vanishing Point Assisted Road Segmentation and Guidance in Off-Road Environments. In: Huang, DS., Premaratne, P., Jin, B., Qu, B., Jo, KH., Hussain, A. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2023. Lecture Notes in Computer Science, vol 14087. Springer, Singapore. https://doi.org/10.1007/978-981-99-4742-3_37

  • DOI: https://doi.org/10.1007/978-981-99-4742-3_37

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-4741-6

  • Online ISBN: 978-981-99-4742-3

  • eBook Packages: Computer Science, Computer Science (R0)
