Multi-modal text recognition and encryption in scanned document images

Kayani, Maemoona; Ghafoor, Abdul; Riaz, M. Mohsin

doi:10.1007/s11227-022-04912-7


Published: 09 December 2022

Volume 79, pages 7916–7936, (2023)

The Journal of Supercomputing

Maemoona Kayani¹,
Abdul Ghafoor¹ &
M. Mohsin Riaz²


Abstract

Many military and business documents, such as memorandums, invoices, medical records and bills, are transmitted over networks in the form of images. These scanned document images may contain private information that must not be accessed by third parties or unauthorized users. Most encryption techniques encrypt the complete image even though only parts of it require protection, which increases the computational cost and the communication overhead. In this paper, we introduce a novel technique for selective multi-modal text recognition and encryption in scanned document images that addresses both security and efficiency. Maximally stable extremal regions (MSER) are used in combination with two filtering techniques to automatically detect regions of interest (ROI) that contain text. The proposed technique combines optical character recognition (OCR) and natural language processing (NLP) for text recognition, thus taking advantage of different modalities of data. The multi-modal recognized text is then encrypted using the Advanced Encryption Standard in cipher-block chaining mode (AES-CBC) and a hybrid chaotic map. Simulation results illustrate the reliability and efficiency of the proposed ROI-based encryption scheme. Moreover, security analysis demonstrates the robustness of the proposed encryption technique against linear and differential cryptographic attacks.
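The selective idea in the abstract, encrypting only the text-bearing ROI rather than the whole image, can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes the ROI is already known as a rectangle (no MSER, OCR or NLP step), it omits AES-CBC entirely, and it uses a plain logistic map as a stand-in for the hybrid chaotic map, whose definition is not given here. The names `logistic_keystream` and `xor_roi`, the parameters `x0` and `r`, and the toy image are all hypothetical, and the keystream is not cryptographically secure.

```python
def logistic_keystream(x0, r, n, burn_in=100):
    """Return n bytes derived from the logistic map x <- r * x * (1 - x).

    Assumption: a plain logistic map stands in for the paper's hybrid
    chaotic map. x0 and r act as the secret key in this toy example.
    """
    x = x0
    for _ in range(burn_in):          # discard the transient iterations
        x = r * x * (1.0 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1.0 - x)
        stream.append(int(x * 256) % 256)   # quantise the state to one byte
    return stream


def xor_roi(image, roi, x0=0.3141, r=3.99):
    """Return a copy of `image` with only the ROI pixels XORed with the keystream.

    `image` is a list of rows of 8-bit grey values; `roi` is a hypothetical
    (top, left, height, width) rectangle. XOR is its own inverse, so the
    same call decrypts.
    """
    top, left, height, width = roi
    stream = logistic_keystream(x0, r, height * width)
    result = [row[:] for row in image]      # leave the input untouched
    k = 0
    for i in range(top, top + height):
        for j in range(left, left + width):
            result[i][j] ^= stream[k]
            k += 1
    return result


# Toy 6x8 "scanned image" and a 3x4 text region inside it.
image = [[(7 * i + 13 * j) % 256 for j in range(8)] for i in range(6)]
roi = (1, 2, 3, 4)

encrypted = xor_roi(image, roi)
decrypted = xor_roi(encrypted, roi)
```

Only `height * width` keystream bytes are generated and applied, so the work scales with the ROI area rather than with the full image, which is the efficiency argument the abstract makes for ROI-based encryption.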


Data availability statement

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.


Author information

Authors and Affiliations

Military College of Signals, National University of Sciences and Technology (NUST), Islamabad, Pakistan
Maemoona Kayani & Abdul Ghafoor
Center for Advanced Studies in Telecommunication (CAST), COMSATS, Islamabad, Pakistan
M. Mohsin Riaz


Corresponding author

Correspondence to Abdul Ghafoor.

Ethics declarations

Conflict of interest

All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version. This manuscript has not been submitted to, nor is under review at, another journal or other publishing venue. The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Kayani, M., Ghafoor, A. & Riaz, M.M. Multi-modal text recognition and encryption in scanned document images. J Supercomput 79, 7916–7936 (2023). https://doi.org/10.1007/s11227-022-04912-7


Accepted: 24 September 2022
Published: 09 December 2022
Issue Date: May 2023
DOI: https://doi.org/10.1007/s11227-022-04912-7
