Multi-level graph convolutional recurrent neural network for semantic image segmentation

Telecommunication Systems

Abstract

With the advent of the Internet of Things (IoT) era, many devices have emerged that capture and generate various kinds of visual data. Recognizing and extracting meaningful patterns from these data requires powerful methods suited to different IoT applications. Deep convolutional neural networks (CNNs) have significantly improved performance on almost all computer vision tasks, including semantic image segmentation. However, feature extraction in CNNs may cause the loss of contextual and spatial information. Moreover, the standard convolutional and pooling layers adopted by most CNN architectures lead to a fixed receptive field, which makes it challenging to handle multi-scale objects in an image. To remedy these issues in semantic image segmentation, this paper proposes a multi-level graph convolutional recurrent neural network (MGCRNN) that combines CNNs and graph neural networks (GNNs) to fuse multi-level features. By applying a graph convolutional recurrent neural network (GCRNN), the proposed model acquires a global view of the image and aggregates multi-level contextual and structural information. The experiments verify the ability of the GCRNN to obtain a flexible receptive field and learn structural features without losing spatial information. Results on the Pascal VOC 2012 and Cityscapes datasets show that the proposed model outperforms baseline approaches and is competitive with state-of-the-art methods.
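The idea of treating CNN features as graph nodes to escape a fixed receptive field can be illustrated with a small sketch. The code below is not the authors' GCRNN, whose details are not given in this preview. It assumes a generic GCN-style propagation rule (symmetrically normalized adjacency with self-loops), a k-nearest-neighbour graph built in feature space, and a shared weight matrix reapplied for several steps as a simple stand-in for a recurrent update. All function names (`grid_to_nodes`, `knn_adjacency`, `recurrent_graph_conv`) and parameters are hypothetical.

```python
import numpy as np

def grid_to_nodes(feature_map):
    """Flatten a (C, H, W) feature map into an (H*W, C) node-feature matrix."""
    c, h, w = feature_map.shape
    return feature_map.reshape(c, h * w).T

def knn_adjacency(nodes, k):
    """Symmetric k-nearest-neighbour adjacency in feature space (assumed graph rule)."""
    sq = np.sum(nodes ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * nodes @ nodes.T
    np.fill_diagonal(dist, np.inf)            # a node is never its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]     # k closest nodes per row
    n = nodes.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # symmetrize

def normalize(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def recurrent_graph_conv(nodes, adj, weight, steps):
    """Apply the same graph convolution `steps` times (shared weights)."""
    a_norm = normalize(adj)
    h = nodes
    for _ in range(steps):
        h = np.tanh(a_norm @ h @ weight)      # aggregate neighbours, then transform
    return h

# Toy usage: random values stand in for a CNN feature map.
rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 6, 6))         # C=8 channels on a 6x6 grid
nodes = grid_to_nodes(fmap)
adj = knn_adjacency(nodes, k=4)
weight = rng.standard_normal((8, 8)) * 0.1
out = recurrent_graph_conv(nodes, adj, weight, steps=3)
out_map = out.T.reshape(8, 6, 6)              # back to (C, H, W), spatial size unchanged
```

Because neighbours are chosen by feature similarity rather than grid position, a node can aggregate information from distant pixels in a single step, and the output keeps the input's spatial resolution. This is the intuition behind a flexible receptive field; the actual graph construction and recurrent cell used in the paper may differ.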

Acknowledgements

This research is supported by the National Key Research and Development Project under Grant 2018YFB1801600.

Author information

Corresponding author

Correspondence to Wei Liang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, D., Qu, H., Zhao, J. et al. Multi-level graph convolutional recurrent neural network for semantic image segmentation. Telecommun Syst 77, 563–576 (2021). https://doi.org/10.1007/s11235-021-00769-y

  • DOI: https://doi.org/10.1007/s11235-021-00769-y
