Multi-level graph convolutional recurrent neural network for semantic image segmentation

Jiang, Dingchao; Qu, Hua; Zhao, Jihong; Zhao, Jianlong; Liang, Wei

doi:10.1007/s11235-021-00769-y

Published: 25 March 2021

Volume 77, pages 563–576, (2021)

Telecommunication Systems

Dingchao Jiang¹,
Hua Qu¹,
Jihong Zhao²,
Jianlong Zhao³ &
…
Wei Liang ORCID: orcid.org/0000-0002-5074-1363⁴

Abstract

With the advent of the Internet of Things (IoT) era, many devices have emerged that capture and generate visual data of various kinds. Recognizing and extracting meaningful patterns from these data requires powerful methods suited to different IoT applications. Deep convolutional neural networks (CNNs) have significantly improved performance on almost all computer vision tasks, including semantic image segmentation. However, feature extraction in CNNs can lose contextual and spatial information. Moreover, the standard convolutional and pooling layers adopted by most CNN architectures yield a fixed receptive field, which makes it difficult to handle multi-scale objects in an image. To remedy these issues in semantic image segmentation, this paper proposes a multi-level graph convolutional recurrent neural network (MGCRNN) that combines CNNs and graph neural networks (GNNs) to fuse multi-level features. By applying a graph convolutional recurrent neural network (GCRNN), the proposed model acquires a global view of the image and aggregates multi-level contextual and structural information. The experiments verify the ability of GCRNN to obtain a flexible receptive field and learn structural features without losing spatial information. Results on the Pascal VOC 2012 and Cityscapes datasets show that the proposed model outperforms baseline approaches and is competitive with state-of-the-art methods.
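The abstract does not give the MGCRNN equations, so the following is only a generic sketch of the idea it describes: a recurrent cell whose gates use graph propagation, stepped over a sequence of feature levels. It is not the authors' model. The GRU-style gating, the symmetric GCN-style normalization, the assumption that every level has the same feature width, and all names (`normalized_adjacency`, `graph_gru_step`, `fuse_levels`) are illustrative assumptions.

```python
import math
import random


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def normalized_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 (assumed, GCN-style)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate(a_norm, feats, weight, activation):
    """Propagate node features over the graph, apply a dense map, then activate."""
    out = matmul(matmul(a_norm, feats), weight)
    return [[activation(v) for v in row] for row in out]


def graph_gru_step(a_norm, x, h, params):
    """One GRU-style update in which every gate sees graph-propagated features."""
    xh = [xi + hi for xi, hi in zip(x, h)]  # per-node concatenation [x, h]
    z = gate(a_norm, xh, params["Wz"], sigmoid)  # update gate
    r = gate(a_norm, xh, params["Wr"], sigmoid)  # reset gate
    xrh = [xi + [rv * hv for rv, hv in zip(ri, hi)] for xi, ri, hi in zip(x, r, h)]
    cand = gate(a_norm, xrh, params["Wc"], math.tanh)  # candidate state
    return [
        [(1.0 - zv) * hv + zv * cv for zv, hv, cv in zip(zi, hi, ci)]
        for zi, hi, ci in zip(z, h, cand)
    ]


def fuse_levels(adj, level_features, hidden_dim, seed=0):
    """Treat each feature level as one recurrent step; return the final node states.

    Weights are random placeholders (untrained), used only to show the data flow.
    """
    rng = random.Random(seed)
    in_dim = len(level_features[0][0])
    params = {
        name: [[rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
               for _ in range(in_dim + hidden_dim)]
        for name in ("Wz", "Wr", "Wc")
    }
    a_norm = normalized_adjacency(adj)
    h = [[0.0] * hidden_dim for _ in adj]
    for x in level_features:  # shallow-to-deep order is an assumption
        h = graph_gru_step(a_norm, x, h, params)
    return h


if __name__ == "__main__":
    # Hypothetical toy input: a 3-node path graph with two feature levels.
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    levels = [
        [[0.5, -0.2], [0.1, 0.3], [0.0, 1.0]],
        [[0.2, 0.2], [-0.4, 0.6], [0.9, -0.1]],
    ]
    for row in fuse_levels(adj, levels, hidden_dim=4):
        print([round(v, 4) for v in row])
```

In this sketch each node's state after a step depends on its graph neighbours, so repeated steps widen the region a node can draw on without any pooling. How the paper actually builds the graph from CNN feature maps is not stated in the abstract and is left out here.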

Acknowledgements

This research is supported by the National Key Research and Development Project under Grant 2018YFB1801600.

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, People’s Republic of China
Dingchao Jiang & Hua Qu
School of Telecommunication and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an, People’s Republic of China
Jihong Zhao
School of Software Engineering, Xi’an Jiaotong University, Xi’an, People’s Republic of China
Jianlong Zhao
College of Computer Science and Electronic Engineering, Hunan University, Changsha, People’s Republic of China
Wei Liang

Corresponding author

Correspondence to Wei Liang.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jiang, D., Qu, H., Zhao, J. et al. Multi-level graph convolutional recurrent neural network for semantic image segmentation. Telecommun Syst 77, 563–576 (2021). https://doi.org/10.1007/s11235-021-00769-y

Accepted: 13 February 2021
Published: 25 March 2021
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11235-021-00769-y
