research-article

A Novel Attack on Monochrome and Greyscale Devanagari CAPTCHAs

Published: 26 May 2021

Abstract

The use of computer programs to breach website security is common today. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) and human interaction proofs are a cost-effective defence against these kinds of automated attacks on websites. CAPTCHAs are available in many forms, such as those based on text, images and audio. A CAPTCHA must be secure enough that a computer program cannot break it, and usable enough that humans can easily solve it. The most popular form is the text-based scheme. Most text-based CAPTCHAs are based on the English language and are not usable by many native people of India. Research has shown that native people are more comfortable with CAPTCHAs based on their native language. Devanagari-based CAPTCHAs are also available, but their security has not been tested. English language–based CAPTCHAs, by contrast, have already been broken successfully. Therefore, it is important to test the security of Devanagari script-based CAPTCHAs. We picked five unique monochrome CAPTCHA schemes and five unique greyscale CAPTCHA schemes for security testing. We achieved segmentation rates of 88.13% to 97.6% on these schemes and extracted six types of features from the segmented characters: raw pixels, zoning, projection, Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF) and Oriented FAST and Rotated BRIEF (ORB). For classification, we compared three classifiers: k-Nearest Neighbour (k-NN), Support Vector Machine (SVM) and Random Forest. All three achieved high recognition rates on both monochrome and greyscale schemes. For monochrome Devanagari CAPTCHAs, the recognition rate ranges from 64.78% to 82.39% for k-NN, from 76.46% to 91.34% for SVM and from 80.34% to 91.28% for Random Forest. For greyscale Devanagari CAPTCHAs, the recognition rate ranges from 67.52% to 85.47% for k-NN, from 76.9% to 91.71% for SVM and from 83.07% to 92.13% for Random Forest. Overall, we achieved a breaking rate of 66% to 85% for monochrome schemes and 73% to 93% for greyscale schemes.
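As a rough illustration of two of the simpler feature types named in the abstract (zoning and projection) feeding a nearest-neighbour classifier, the following is a minimal sketch. It is not the authors' implementation: the function names, the 4 x 4 zone grid, the 8 x 8 toy glyphs and the plain Euclidean k-NN are all assumptions chosen for brevity, and the sketch assumes a character has already been segmented into a binary array with foreground pixels set to 1.

```python
import numpy as np


def zoning_features(glyph, zones=(4, 4)):
    """Foreground density in each cell of a zones[0] x zones[1] grid.

    The grid size is an illustrative assumption, not a value from the paper.
    """
    glyph = np.asarray(glyph, dtype=float)
    feats = []
    for row_block in np.array_split(glyph, zones[0], axis=0):
        for cell in np.array_split(row_block, zones[1], axis=1):
            feats.append(cell.mean() if cell.size else 0.0)
    return np.array(feats)


def projection_features(glyph):
    """Normalised horizontal and vertical projection profiles, concatenated."""
    glyph = np.asarray(glyph, dtype=float)
    height, width = glyph.shape
    row_profile = glyph.sum(axis=1) / width    # one value per row
    col_profile = glyph.sum(axis=0) / height   # one value per column
    return np.concatenate([row_profile, col_profile])


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k training vectors closest to the query."""
    dists = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]


# Toy "characters": a vertical bar and a horizontal bar (hypothetical data).
vbar = np.zeros((8, 8))
vbar[:, 3] = 1
hbar = np.zeros((8, 8))
hbar[3, :] = 1

train_x = np.stack([zoning_features(vbar), zoning_features(hbar)])
train_y = np.array(["v", "h"])

# A slightly noisy vertical bar should still land nearest to "v".
noisy = vbar.copy()
noisy[0, 0] = 1
prediction = knn_predict(train_x, train_y, zoning_features(noisy), k=1)
```

In a real attack the same feature vectors could be passed to any of the three classifier families the abstract compares; the toy k-NN here only shows how a fixed-length descriptor is built from a segmented glyph.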

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 20, Issue 4
July 2021
419 pages
ISSN: 2375-4699
EISSN: 2375-4702
DOI: 10.1145/3465463

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 May 2021
Accepted: 01 November 2020
Revised: 01 September 2020
Received: 01 July 2020
Published in TALLIP Volume 20, Issue 4

Author Tags

  1. Devanagari CAPTCHA
  2. human interaction proofs
  3. monochrome Devanagari CAPTCHA
  4. greyscale Devanagari CAPTCHA
  5. Internet security
  6. bot
  7. reverse Turing test

Qualifiers

  • Research-article
  • Refereed
