DOI: 10.1145/3565291.3565331
research-article

DCT-CenterMask: A Real-Time Instance Segmentation Network for Kidney Ultrasound Images

Published: 16 December 2022

Abstract

Ultrasound is the preferred imaging method for detecting chronic kidney disease, and it provides real-time dynamic imaging. Existing real-time instance segmentation models meet the real-time requirement of 30 fps, but their segmentation accuracy still lags behind that of general (non-real-time) instance segmentation models. In this paper, we propose a real-time instance segmentation model based on CenterMask. Because the discrete cosine transform (DCT) reduces the energy lost when the ground-truth mask is downsampled, we train a new DCT-based mask representation while maintaining real-time segmentation speed. We also integrate deformable convolution and a new attention mechanism into the network to enlarge its overall receptive field and improve its ability to learn detailed features. To support further research, we built a kidney instance segmentation dataset in COCO format. In our experiments, the model improves mask AP by about 6% over the original model.
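
The energy argument in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the synthetic elliptical mask, the 128×128 resolution, the square 16×16 block of retained low-frequency coefficients, and every function name below are assumptions chosen only for illustration. The sketch encodes a binary mask with the same budget of 256 numbers in two ways, as low-frequency 2D DCT coefficients and as a 16×16 average-pooled grid, then reconstructs both and reports how well each matches the original mask.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; each row is one frequency."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m


def dct_encode(mask: np.ndarray, keep: int) -> np.ndarray:
    """2D DCT of a square mask, keeping only the keep x keep low-frequency block."""
    d = dct_matrix(mask.shape[0])
    return (d @ mask @ d.T)[:keep, :keep]


def dct_decode(coeffs: np.ndarray, size: int) -> np.ndarray:
    """Zero-pad the kept coefficients to size x size and apply the inverse 2D DCT."""
    full = np.zeros((size, size))
    k = coeffs.shape[0]
    full[:k, :k] = coeffs
    d = dct_matrix(size)
    return d.T @ full @ d


def downsample_roundtrip(mask: np.ndarray, factor: int) -> np.ndarray:
    """Average-pool by `factor`, then upsample back by pixel repetition."""
    n = mask.shape[0]
    small = mask.reshape(n // factor, factor, n // factor, factor).mean(axis=(1, 3))
    return np.kron(small, np.ones((factor, factor)))


def iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union after thresholding both inputs at 0.5."""
    p, t = pred > 0.5, target > 0.5
    return float((p & t).sum() / (p | t).sum())


# Hypothetical ground-truth mask: a filled ellipse on a 128 x 128 grid.
size = 128
yy, xx = np.mgrid[:size, :size]
mask = ((((xx - 64) / 45.0) ** 2 + ((yy - 60) / 28.0) ** 2) <= 1.0).astype(float)

coeffs = dct_encode(mask, keep=16)                 # 16 * 16 = 256 numbers
recon_dct = dct_decode(coeffs, size)
recon_grid = downsample_roundtrip(mask, factor=8)  # 16 * 16 = 256 numbers

# Fraction of the mask's energy carried by the kept coefficients (Parseval).
energy_kept = float((coeffs ** 2).sum() / (mask ** 2).sum())
iou_dct = iou(recon_dct, mask)
iou_grid = iou(recon_grid, mask)

print(f"energy kept by DCT block: {energy_kept:.4f}")
print(f"IoU, DCT reconstruction:  {iou_dct:.4f}")
print(f"IoU, grid reconstruction: {iou_grid:.4f}")
```

Because the DCT basis used here is orthonormal, the retained energy fraction is at most 1, and keeping all coefficients reproduces the mask exactly. The square coefficient block is a simplification; other selection orders, such as a zigzag scan, are possible.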

    Published In

    ICBDT '22: Proceedings of the 5th International Conference on Big Data Technologies
    September 2022
    454 pages
    ISBN: 9781450396875
    DOI: 10.1145/3565291
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CenterMask
    2. Deformable convolution network
    3. Discrete cosine transform
    4. Real-time instance segmentation
    5. SimAM attention mechanism

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICBDT 2022
