Improvising the CNN Feature Maps Through Integration of Channel Attention for Handwritten Text Recognition

  • Conference paper
Computer Vision and Image Processing (CVIP 2022)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1777)

Abstract

Architectures that pair a Convolutional Neural Network (CNN) encoder with a Recurrent Neural Network (RNN) decoder are widely used in Handwritten Text Recognition (HTR) systems, and the quality of the encoder representation plays a vital role in their performance. Squeeze-and-Excitation Networks, previously applied to image classification, object detection and scene classification, capture global inter-channel dependencies, whereas ECA-Net learns channel attention through local Cross-Channel Interaction (CCI). This work proposes an encoder-decoder architecture for HTR that combines local and global cross-channel attention to obtain a more effective encoder representation. On the IAM dataset, adding the proposed module reduces the Character Error Rate (CER) by 8.98% and 3.24%, and the Word Error Rate (WER) by 8.98% and 3.45%, for the state-of-the-art HTR Flor model and the Puigcerver model, respectively. The work also presents a detailed character-level error analysis on the IAM dataset.
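
To make the two attention mechanisms named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. The Squeeze-and-Excitation gate (global average pooling followed by a two-layer bottleneck) and the ECA gate (global average pooling followed by a 1D convolution across neighbouring channels) follow the commonly described forms of those modules. The abstract does not say how the local and global gates are combined, so the averaging in `combined_channel_attention` is purely an assumption, as are all function names, weight shapes and the odd kernel size.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def se_gate(feat, w1, w2):
    """Squeeze-and-Excitation style gate (global inter-channel dependencies).

    feat: feature map of shape (C, H, W)
    w1:   reduction weights of shape (C // r, C)
    w2:   expansion weights of shape (C, C // r)
    """
    squeezed = feat.mean(axis=(1, 2))        # squeeze: one descriptor per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU
    return sigmoid(w2 @ hidden)              # per-channel gate in (0, 1)


def eca_gate(feat, kernel):
    """ECA style gate (local cross-channel interaction).

    kernel: 1D weights of odd length k, shared across all channel positions.
    """
    squeezed = feat.mean(axis=(1, 2))
    k = kernel.shape[0]
    padded = np.pad(squeezed, k // 2)        # zero padding keeps C outputs
    mixed = np.array(
        [padded[i:i + k] @ kernel for i in range(squeezed.shape[0])]
    )
    return sigmoid(mixed)


def combined_channel_attention(feat, w1, w2, kernel):
    """Hypothetical combination: average the global and local gates.

    This averaging is an illustrative assumption; the abstract does not
    specify the combination rule.
    """
    gate = 0.5 * (se_gate(feat, w1, w2) + eca_gate(feat, kernel))
    return feat * gate[:, None, None]        # rescale each channel


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 4, 16))   # C=8, H=4, W=16
    w1 = rng.standard_normal((2, 8))         # reduction ratio r=4
    w2 = rng.standard_normal((8, 2))
    kernel = rng.standard_normal(3)          # k=3 neighbouring channels
    out = combined_channel_attention(feat, w1, w2, kernel)
    print(out.shape)
```

In an encoder, a module of this kind would typically be applied to the output of a convolutional block so that each channel is rescaled before the next layer. The SE gate lets every channel influence every other channel, while the ECA gate mixes only `k` adjacent channels with far fewer parameters.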

Author information

Corresponding author

Correspondence to S. Nagesh Bhattu.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Shashank, B.N., Nagesh Bhattu, S., Sri Phani Krishna, K. (2023). Improvising the CNN Feature Maps Through Integration of Channel Attention for Handwritten Text Recognition. In: Gupta, D., Bhurchandi, K., Murala, S., Raman, B., Kumar, S. (eds) Computer Vision and Image Processing. CVIP 2022. Communications in Computer and Information Science, vol 1777. Springer, Cham. https://doi.org/10.1007/978-3-031-31417-9_37

  • DOI: https://doi.org/10.1007/978-3-031-31417-9_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-31416-2

  • Online ISBN: 978-3-031-31417-9

  • eBook Packages: Computer Science, Computer Science (R0)
