Abstract
Deep neural networks are susceptible to label noise, which can lead to poor generalization. In histopathology segmentation datasets, labels are particularly prone to degradation because of the large inter-observer variability between expert annotators, so obtaining a clean dataset may not be feasible. We address this by using Knowledge Distillation as a learned Label Smoothing Regularizer, which has a denoising effect when training on a noisy dataset. To show the effectiveness of our approach, we evaluate on the Gleason 2019 Challenge dataset, which exhibits high discordance between expert pathologists. Our experiments show that the distilled model achieves a significant performance gain when trained on the noisy dataset.
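To make the mechanism concrete, the sketch below shows how a distillation objective of this kind is commonly set up in PyTorch: the student segmentation network is trained with cross-entropy against the noisy annotations, while a KL-divergence term pulls its temperature-softened predictions toward the teacher's, acting as a learned, per-pixel label smoothing target rather than a fixed uniform one. This is a minimal sketch under stated assumptions; the loss weighting alpha, the temperature, and the function names are illustrative, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, noisy_labels,
                      temperature=4.0, alpha=0.5):
    """Hypothetical noise-robust segmentation loss combining supervised
    cross-entropy with a knowledge-distillation term.

    student_logits, teacher_logits: (N, C, H, W) class logits.
    noisy_labels: (N, H, W) integer label maps from noisy annotations.
    """
    # Standard supervised loss against the (possibly noisy) labels.
    ce = F.cross_entropy(student_logits, noisy_labels)

    # Softened teacher predictions serve as a learned label smoothing
    # regularizer, damping the gradient contribution of corrupted labels.
    t = temperature
    kd = F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)  # T^2 rescaling keeps gradient magnitudes comparable

    # alpha balances fitting the noisy labels vs. matching the teacher
    # (an assumed hyperparameter, to be tuned on validation data).
    return alpha * ce + (1.0 - alpha) * kd
```

In use, the teacher is run in evaluation mode with gradients disabled (e.g. under `torch.no_grad()`), and only the student's parameters are updated with this combined loss.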
Cite this paper
Raipuria, G., Bonthu, S., Singhal, N. (2021). Noise Robust Training of Segmentation Model Using Knowledge Distillation. In: Del Bimbo, A., et al. (eds.) Pattern Recognition. ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol. 12661. Springer, Cham. https://doi.org/10.1007/978-3-030-68763-2_8