Abstract
U-net is one of the most common frameworks for medical image semantic segmentation. By combining skip connections with multi-scale feature fusion, U-net achieves excellent segmentation performance. However, traditional U-net architectures have a large number of parameters, which can slow down image processing. In this paper, we design a fast and effective glottis segmentation method based on a lightweight U-net, which maintains segmentation accuracy while also taking processing speed into account. We simplify the model structure and introduce a self-attention module to improve segmentation performance. We also propose a hybrid loss for glottis region segmentation. Compared with mainstream segmentation networks such as U-net, PSPNet, SegNet, and Nested U-net, the proposed model has fewer parameters and achieves higher Dice score, mIoU, FWIoU, and mPA.
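For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how Dice, mIoU, FWIoU, and mPA are commonly computed from a confusion matrix. It uses the standard textbook definitions and is not taken from the paper; the function names, the two-class setup, and the toy masks are assumptions made for illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=2):
    """Rows index ground-truth classes, columns index predicted classes."""
    idx = num_classes * y_true.ravel() + y_pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def segmentation_metrics(cm, foreground=1):
    """Standard definitions; assumes every class occurs in the ground truth."""
    tp = np.diag(cm).astype(float)        # correctly labelled pixels per class
    gt = cm.sum(axis=1).astype(float)     # ground-truth pixels per class
    pred = cm.sum(axis=0).astype(float)   # predicted pixels per class
    iou = tp / (gt + pred - tp)           # per-class intersection over union
    freq = gt / cm.sum()                  # class frequency in the ground truth
    return {
        "dice": 2 * tp[foreground] / (gt[foreground] + pred[foreground]),
        "mIoU": iou.mean(),               # mean IoU over classes
        "FWIoU": (freq * iou).sum(),      # frequency-weighted IoU
        "mPA": (tp / gt).mean(),          # mean per-class pixel accuracy
    }

# Toy 2x4 masks (hypothetical data): 1 = glottis, 0 = background.
y_true = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
y_pred = np.array([[0, 0, 1, 0], [0, 0, 1, 1]])
cm = confusion_matrix(y_true, y_pred)
metrics = segmentation_metrics(cm)
```

In this toy case one glottis pixel is missed, so the per-class IoU values are 0.8 (background) and 0.75 (glottis). Because the two classes are equally frequent here, mIoU and FWIoU coincide; on real laryngeal images, where the glottis occupies a small part of the frame, FWIoU is dominated by the background class.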
This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61971368, U20A20162, and 61731012, and in part by the Natural Science Foundation of Fujian Province of China under Grant No. 2019J01003.
References
Chen, X., Bless, D., Yan, Y.: A segmentation scheme based on Rayleigh distribution model for extracting glottal waveform from high-speed laryngeal images. In: 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, pp. 6269–6272. IEEE (2005)
Mendez, A., Garcia, B., Ruiz, I., Iturricha, I.: Glottal area segmentation without initialization using Gabor filters. In: 2008 IEEE International Symposium on Signal Processing and Information Technology, pp. 18–22. IEEE (2008)
Gutiérrez-Arriola, J.M., Osma-Ruiz, V., Sáenz-Lechón, N., Godino-Llorente, J.I., Fraile, R., Arias-Londoño, J.D.: Objective measurements to evaluate glottal space segmentation from laryngeal images. In: 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 5396–5399. IEEE (2012)
Ammar-Badri, H., Benazza-Benyahia, A.: Statistical glottal segmentation of videoendoscopic images using geodesic active contours. In: 2014 1st International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), pp. 198–203. IEEE (2014)
Gloger, O., Lehnert, B., Schrade, A., Völzke, H.: Fully automated glottis segmentation in endoscopic videos using local color and shape features of glottal regions. IEEE Trans. Biomed. Eng. 62(3), 795–806 (2014)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440. IEEE (2015)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Badrinarayanan, V., Kendall, A., Cipolla, R.: SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: UNet++: a nested U-Net architecture for medical image segmentation. In: Stoyanov, D., et al. (eds.) DLMIA/ML-CDS 2018. LNCS, vol. 11045, pp. 3–11. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00889-5_1
Oktay, O., et al.: Attention U-Net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999 (2018)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988. IEEE (2017)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Huang, X., Deng, J., Wang, X., Zhuang, P., Huang, L., Zhao, C. (2022). Automatic Glottis Segmentation Method Based on Lightweight U-net. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13535. Springer, Cham. https://doi.org/10.1007/978-3-031-18910-4_8
DOI: https://doi.org/10.1007/978-3-031-18910-4_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18909-8
Online ISBN: 978-3-031-18910-4
eBook Packages: Computer Science, Computer Science (R0)