A Novel Multi-head Attention and Long Short-Term Network for Enhanced Inpainting of Occluded Handwriting

  • Correspondence
Cognitive Computation

Abstract

In handwritten character recognition, inpainting occluded offline characters is essential. Building on the success of transformers across a range of tasks, we present a novel framework called “Enhanced Inpainting with Multi-head Attention and stacked long short-term memory (LSTM) Network” (E-Inpaint). The framework aims to restore occluded offline handwriting while capturing its online signal counterpart, which is enriched with dynamic characteristics. The proposed approach employs a convolutional neural network (CNN) and a multi-layer perceptron (MLP) to extract essential hidden features from the handwriting image. These features are then decoded by a stacked LSTM with multi-head attention, which performs the inpainting and generates the online signal corresponding to the uncorrupted version. To validate our work, we use the Beta-GRU recognition system on Latin, Indian, and Arabic On/Off dual datasets. The results show the effectiveness of the stacked LSTM network with multi-head attention, which enhances the quality of the restored image and significantly improves the recognition rate obtained with the Beta-GRU system. Our research highlights the potential of E-Inpaint for enhancing handwritten character recognition systems.
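
To make the decoding step in the abstract more concrete, the following is a minimal sketch of multi-head attention in which a single decoder state attends over a set of encoder feature vectors. It is not the authors' implementation: the function names, the dimensions (8-dimensional features, 2 heads, 5 feature vectors), and the random weights are all assumptions chosen for illustration. The CNN/MLP encoder and the stacked LSTM are not modelled here; `features` and `decoder_state` simply stand in for their outputs.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Hypothetical weight initialisation; real weights would be learned."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(query, features, params, num_heads):
    """Attend from one decoder state (query) over encoder feature vectors.

    Returns the attended context vector and the per-head attention weights.
    """
    d_model = len(query)
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    # Linear projections of the query, keys, and values.
    q = matvec(params["Wq"], query)
    ks = [matvec(params["Wk"], f) for f in features]
    vs = [matvec(params["Wv"], f) for f in features]

    context, all_weights = [], []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        # Scaled dot-product scores between this head's query and key slices.
        scores = [
            sum(a * b for a, b in zip(q[lo:hi], k[lo:hi])) / math.sqrt(d_head)
            for k in ks
        ]
        weights = softmax(scores)
        all_weights.append(weights)
        # Weighted sum of this head's value slices; heads are concatenated.
        for j in range(lo, hi):
            context.append(sum(w * v[j] for w, v in zip(weights, vs)))

    # Final output projection applied to the concatenated heads.
    return matvec(params["Wo"], context), all_weights


# Illustrative usage with made-up sizes and random inputs.
rng = random.Random(0)
d_model, num_heads, num_features = 8, 2, 5
params = {
    name: random_matrix(d_model, d_model, rng)
    for name in ("Wq", "Wk", "Wv", "Wo")
}
features = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(num_features)]
decoder_state = [rng.uniform(-1, 1) for _ in range(d_model)]
context, weights = multi_head_attention(decoder_state, features, params, num_heads)
```

In a decoder of the kind the abstract describes, a context vector like this would typically be combined with the recurrent state at each time step before predicting the next point of the online signal; how E-Inpaint combines them is not specified in the abstract.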

Data Availability

No datasets were generated or analysed during the current study.

Funding

This study was funded by the Ministry of Higher Education and Scientific Research of Tunisia (grant number LR11ES4).

Author information

Contributions

All authors reviewed the manuscript. Besma Rabhi wrote the main manuscript and implemented the coding step. Abdelkarim Elbaati examined the architecture. Yahia Hamdi prepared the figures. Habib Dhahri examined the experimental results. Umapada Pal, Habib Chabchoub, and Khmaies Ouahada reviewed the manuscript. Adel M. Alimi supervised the work and reviewed the manuscript.

Corresponding author

Correspondence to Besma Rabhi.

Ethics declarations

Ethics Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Competing Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rabhi, B., Elbaati, A., Hamdi, Y. et al. A Novel Multi-head Attention and Long Short-Term Network for Enhanced Inpainting of Occluded Handwriting. Cogn Comput 17, 6 (2025). https://doi.org/10.1007/s12559-024-10382-1

  • DOI: https://doi.org/10.1007/s12559-024-10382-1

Keywords