Abstract
Architectures that pair a Convolutional Neural Network (CNN) encoder with a Recurrent Neural Network (RNN) decoder are widely used in Handwritten Text Recognition (HTR) systems, and an effective encoder representation plays a vital role in their performance. Squeeze-and-Excitation Networks, used in image classification, object detection and scene classification, capture global inter-channel dependencies. ECA-Net learns channel attention via local Cross Channel Interaction (CCI). The current work proposes an encoder-decoder architecture for HTR that combines the benefits of local and global cross-channel attention for effective encoder representation. Experimental results on the IAM dataset show reductions of 8.98% and 3.24% in Character Error Rate (CER), and of 8.98% and 3.45% in Word Error Rate (WER), when the proposed module is applied to the state-of-the-art HTR Flor model and Puigcerver model, respectively. The work also presents a detailed character-level error analysis on the IAM dataset.
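To make the distinction between global and local cross-channel attention concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not say how the two attention signals are fused, so averaging the two gate vectors is an assumption made here purely for illustration. The function names, the reduction ratio `r`, the kernel size `k`, and the random weights are likewise hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_gate(feat, w1, w2):
    """Global channel attention in the style of Squeeze-and-Excitation.

    feat: (C, H, W) feature map; w1: (C // r, C); w2: (C, C // r).
    Every output gate depends on all C channels.
    """
    z = feat.mean(axis=(1, 2))       # squeeze: global average pool per channel
    h = np.maximum(w1 @ z, 0.0)      # bottleneck projection followed by ReLU
    return sigmoid(w2 @ h)           # excitation: one gate in (0, 1) per channel

def eca_gate(feat, kernel):
    """Local channel attention in the style of ECA-Net.

    kernel: (k,) weights of a 1D filter slid across the channel axis (k odd).
    Each output gate depends only on k neighbouring channels.
    """
    z = feat.mean(axis=(1, 2))
    k = len(kernel)
    zp = np.pad(z, k // 2)           # zero-pad so the output keeps C entries
    y = np.array([zp[i:i + k] @ kernel for i in range(len(z))])
    return sigmoid(y)

def combined_attention(feat, w1, w2, kernel):
    # ASSUMPTION: the two gates are fused by a simple average; the source
    # abstract does not specify the fusion rule.
    gate = 0.5 * (se_gate(feat, w1, w2) + eca_gate(feat, kernel))
    return feat * gate[:, None, None]  # rescale each channel of the feature map

# Toy usage with random weights (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W, r, k = 8, 4, 16, 2, 3
feat = rng.standard_normal((C, H, W))
w1 = 0.1 * rng.standard_normal((C // r, C))
w2 = 0.1 * rng.standard_normal((C, C // r))
kernel = 0.1 * rng.standard_normal(k)
out = combined_attention(feat, w1, w2, kernel)
```

In a real encoder the weights would be learned and the module would typically be placed after a convolutional block; the sketch only shows that the global gate mixes all channels through a bottleneck while the local gate mixes a small neighbourhood of channels.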
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Shashank, B.N., Nagesh Bhattu, S., Sri Phani Krishna, K. (2023). Improvising the CNN Feature Maps Through Integration of Channel Attention for Handwritten Text Recognition. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1777. Springer, Cham. https://doi.org/10.1007/978-3-031-31417-9_37
DOI: https://doi.org/10.1007/978-3-031-31417-9_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31416-2
Online ISBN: 978-3-031-31417-9
eBook Packages: Computer Science, Computer Science (R0)