Abstract
A common limitation of almost all state-of-the-art techniques for automated polyp detection and delineation is that they rely on still-frame analysis (Wang et al. 2018; Urban et al. 2018; Shin et al. 2018; Mohammed et al. 2018). Colonoscopy, however, is a video-based modality, and an endoscopist uses contextual information from previous frames to decide whether a polyp is present. Recent work on semantic video segmentation in non-medical applications (Fayyaz et al. 2016; Valipour et al. 2017) shows that including temporal features in a CNN can increase its performance and yield more consistent results over time. Recurrent neural networks (RNNs) are commonly used for sequence modeling and allow a network to retain information over time. Long short-term memory (LSTM) models are a type of RNN that can model longer time dependencies than traditional RNNs. This chapter uses a convolutional variant of the LSTM (Xingjian et al. 2015), which encodes temporal features and incorporates spatial features within a single layer.
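To make the idea of a convolutional LSTM concrete, the following is a minimal sketch of one recurrent step, not the network used in the chapter. It assumes a single input channel and a single hidden channel, 3x3 kernels with zero padding, random untrained weights, and the common formulation without peephole connections. All function and parameter names (`conv2d_same`, `make_params`, `convlstm_step`, `run_sequence`) are hypothetical and chosen for illustration; plain Python lists are used for readability, not speed.

```python
import math
import random


def conv2d_same(img, kernel):
    """Cross-correlate a 2-D list with a square kernel, zero-padded to keep size."""
    rows, cols = len(img), len(img[0])
    k = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += img[rr][cc] * kernel[dr + k][dc + k]
            out[r][c] = acc
    return out


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(seed=0, ksize=3):
    """Random (untrained) kernels for the input, forget, output and candidate gates."""
    rng = random.Random(seed)

    def kernel():
        return [[rng.uniform(-0.5, 0.5) for _ in range(ksize)] for _ in range(ksize)]

    # "wx" convolves the current frame, "wh" convolves the previous hidden state.
    return {g: {"wx": kernel(), "wh": kernel(), "b": 0.0} for g in ("i", "f", "o", "g")}


def convlstm_step(x, h_prev, c_prev, params):
    """One ConvLSTM step: gates are computed by convolutions, not dense products."""
    rows, cols = len(x), len(x[0])
    pre = {}
    for gate, p in params.items():
        ax = conv2d_same(x, p["wx"])       # spatial features of the current frame
        ah = conv2d_same(h_prev, p["wh"])  # spatial features of the remembered state
        pre[gate] = [[ax[r][c] + ah[r][c] + p["b"] for c in range(cols)]
                     for r in range(rows)]
    h_new = [[0.0] * cols for _ in range(rows)]
    c_new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            i = sigmoid(pre["i"][r][c])    # input gate
            f = sigmoid(pre["f"][r][c])    # forget gate
            o = sigmoid(pre["o"][r][c])    # output gate
            g = math.tanh(pre["g"][r][c])  # candidate cell update
            c_new[r][c] = f * c_prev[r][c] + i * g
            h_new[r][c] = o * math.tanh(c_new[r][c])
    return h_new, c_new


def run_sequence(frames, params):
    """Feed frames in order and return the hidden-state map after each frame."""
    rows, cols = len(frames[0]), len(frames[0][0])
    h = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    outputs = []
    for x in frames:
        h, c = convlstm_step(x, h, c, params)
        outputs.append(h)
    return outputs
```

Because the hidden state `h` and cell state `c` are themselves image-shaped maps, the output for a given frame depends on the frames that came before it while keeping the spatial layout intact. In a segmentation network such a layer would typically sit on multi-channel CNN feature maps rather than raw single-channel frames.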
References
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., & Adam, H. (2018). Encoder-decoder with atrous separable convolution for semantic image segmentation. arXiv preprint arXiv:1802.02611.
Eelbode, T., Demedts, I., Bisschops, R., Roelandt, P., Hassan, C., Coron, E., Bhandari, P., Neumann, H., Pech, O., Repici, A., et al. (2019). Tu1931 incorporation of temporal information in a deep neural network improves performance level for automated polyp detection and delineation. Gastrointestinal Endoscopy, 89(6), AB618–AB619.
Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., & Zisserman, A. (2010). The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88, 303–338.
Fayyaz, M., et al. (2016). STFCN: Spatio-temporal FCN for semantic video segmentation. arXiv preprint arXiv:1608.05971.
Krähenbühl, P., & Koltun, V. (2011). Efficient inference in fully connected CRFs with Gaussian edge potentials. In Advances in neural information processing systems (pp. 109–117).
Lin, T.-Y., Goyal, P., Girshick, R., He, K., & Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision (pp. 2980–2988).
Mohammed, A., Yildirim, S., Farup, I., Pedersen, M., & Hovde, Ø. (2018). Y-net: A deep convolutional neural network for polyp detection. arXiv preprint arXiv:1806.01907.
Shelhamer, E., Rakelly, K., Hoffman, J., & Darrell, T. (2016). Clockwork convnets for video semantic segmentation. In European Conference on Computer Vision (pp. 852–868). Springer.
Shin, Y., Qadir, H. A., Aabakken, L., Bergsland, J., & Balasingham, I. (2018). Automatic colon polyp detection using region based deep CNN and post learning approaches. IEEE Access, 6, 40950–40962.
Urban, G., Tripathi, P., Alkayali, T., Mittal, M., Jalali, F., Karnes, W., et al. (2018). Deep learning localizes and identifies polyps in real time with 96% accuracy in screening colonoscopy. Gastroenterology, 155(4), 1069–1078.e8.
Valipour, S., Siam, M., Jagersand, M., & Ray, N. (2017). Recurrent fully convolutional networks for video segmentation. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 29–36). IEEE.
Wang, P., Xiao, X., Glissen Brown, J. R., Berzin, T. M., Tu, M., Xiong, F., Hu, X., Liu, P., Song, Y., Zhang, D., Yang, X., Li, L., He, J., Yi, X., Liu, J., & Liu, X. (2018). Development and validation of a deep-learning algorithm for the detection of polyps during colonoscopy. Nature Biomedical Engineering, 2(10), 741–748.
Xingjian, S., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W.-K., & Woo, W.-c. (2015). Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In Advances in neural information processing systems (pp. 802–810).
Copyright information
© 2021 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Eelbode, T., Sinonquel, P., Bisschops, R., Maes, F. (2021). Convolutional LSTM. In: Bernal, J., Histace, A. (eds) Computer-Aided Analysis of Gastrointestinal Videos. Springer, Cham. https://doi.org/10.1007/978-3-030-64340-9_14
DOI: https://doi.org/10.1007/978-3-030-64340-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64339-3
Online ISBN: 978-3-030-64340-9
eBook Packages: Computer Science, Computer Science (R0)