Joint multi-task cascade for instance segmentation

Wen, Yaole; Hu, Fuyuan; Ren, Jinchang; Shang, Xinru; Li, Linyan; Xi, Xuefeng

doi:10.1007/s11554-020-01007-5

Special Issue Paper
Published: 08 August 2020

Volume 17, pages 1983–1989, (2020)

Journal of Real-Time Image Processing

Yaole Wen¹,⁴,
Fuyuan Hu¹,⁵,
Jinchang Ren²,
Xinru Shang¹,
Linyan Li³ &
…
Xuefeng Xi¹


Abstract

Instance segmentation requires both accurate pixel-level classification and high-level semantic features at the instance level, which makes it a challenging task; a cascade structure can help on both fronts. To make full use of the relationship between detection and segmentation, this paper proposes a joint multi-task cascade structure. Rather than simply cascading the detection and segmentation tasks, it processes them jointly across multiple stages and, in particular, integrates the information from different stages of the mask branch. The structure exploits the strengths of each stage for detection and segmentation, thereby improving the quality of mask prediction. A feature fusion step is introduced in the fully convolutional network (FCN) branch, where high-level and low-level features are fused to enrich the contextual information of the image's semantic features. Experiments on the COCO dataset show improved results.
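
The two ideas named in the abstract, fusing high-level with low-level features and passing mask information between cascade stages, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the fusion operator or the stage wiring, so element-wise summation after nearest-neighbour upsampling, the placeholder "mask head", and the names `upsample_nearest`, `fuse_features`, and `mask_cascade` are all assumptions made for illustration.

```python
from typing import List, Optional

Grid = List[List[float]]  # a single-channel 2-D feature map


def upsample_nearest(feature: Grid, factor: int) -> Grid:
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return [
        [value for value in row for _ in range(factor)]
        for row in feature
        for _ in range(factor)
    ]


def fuse_features(low_level: Grid, high_level: Grid) -> Grid:
    """Fuse a coarse high-level map into a finer low-level map.

    Assumption: fusion is an element-wise sum after upsampling; the
    abstract only says the two levels are fused.
    """
    factor = len(low_level) // len(high_level)
    upsampled = upsample_nearest(high_level, factor)
    return [
        [lo + hi for lo, hi in zip(lo_row, hi_row)]
        for lo_row, hi_row in zip(low_level, upsampled)
    ]


def mask_cascade(roi_feature: Grid, num_stages: int = 3) -> List[Grid]:
    """Toy mask branch: every stage after the first also sees the
    previous stage's output (one reading of "integrating information
    at different stages of the mask branch")."""
    outputs: List[Grid] = []
    previous: Optional[Grid] = None
    for _ in range(num_stages):
        if previous is None:
            stage_input = roi_feature
        else:
            stage_input = fuse_features(roi_feature, previous)
        # Placeholder mask head: a real one would be a small FCN.
        previous = [[0.5 * v for v in row] for row in stage_input]
        outputs.append(previous)
    return outputs


if __name__ == "__main__":
    low = [[1.0] * 4 for _ in range(4)]   # fine, low-level map
    high = [[1.0, 2.0], [3.0, 4.0]]       # coarse, high-level map
    print(fuse_features(low, high))
    print(mask_cascade([[2.0, 4.0]]))
```

In a real network the maps would be multi-channel tensors and each stage would have its own learned head and box regressor; the sketch only shows the data flow between levels and between stages.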




Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61876121 and 61472267), the Foundation of Key Research and Development Projects in Jiangsu Province (No. BE2017663), the Foundation of Natural Science Research Program in Jiangsu Province Higher Education (Nos. 19KJB520054 and 18KJB510042), and the Foundation of Science and Technology Project of Suzhou Water Conservancy and Water Affairs (2015-7-5).

Author information

Authors and Affiliations

School of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou, 215009, Jiangsu, China
Yaole Wen, Fuyuan Hu, Xinru Shang & Xuefeng Xi
Centre for Signal and Image Processing, University of Strathclyde, Glasgow, UK
Jinchang Ren
Suzhou Institute of Trade & Commerce, Suzhou, 215009, Jiangsu, China
Linyan Li
Virtual Reality Key Laboratory of Intelligent Interaction and Application Technology of Suzhou, Suzhou University of Science and Technology, Suzhou, 215009, Jiangsu, China
Yaole Wen
Suzhou Key Laboratory for Big Data and Information Service, Suzhou University of Science and Technology, Suzhou, 215009, Jiangsu, China
Fuyuan Hu

Corresponding author

Correspondence to Fuyuan Hu.

About this article

Cite this article

Wen, Y., Hu, F., Ren, J. et al. Joint multi-task cascade for instance segmentation. J Real-Time Image Proc 17, 1983–1989 (2020). https://doi.org/10.1007/s11554-020-01007-5

Received: 02 November 2019
Accepted: 29 July 2020
Published: 08 August 2020
Issue Date: December 2020
DOI: https://doi.org/10.1007/s11554-020-01007-5
