Text Captcha Is Dead? A Large Scale Deployment and Empirical Study

Research article · DOI: 10.1145/3372297.3417258

Published: 02 November 2020

Abstract

The development of deep learning techniques has significantly increased the ability of computers to recognize CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart), thereby breaking or weakening the security of existing captcha schemes. To protect against these attacks, recent works have proposed leveraging adversarial machine learning to perturb captcha images. However, they either require prior knowledge of the captcha-solving models or lack adaptivity to the evolving behaviors of attackers. Most importantly, none of them has been deployed in practical applications, so their practical applicability and effectiveness remain unknown.
In this work, we introduce advCAPTCHA, a practical adversarial captcha generation system that can defend against deep learning based captcha solvers, and deploy it on a large-scale online platform with nearly a billion users. To the best of our knowledge, this is the first such work to be deployed on an international large-scale online platform. By applying adversarial learning techniques in a novel manner, advCAPTCHA generates adversarial captchas that significantly reduce the success rate of attackers, as demonstrated by a large-scale online study. Furthermore, we validate the feasibility of advCAPTCHA in practical applications, as well as its robustness in defending against various attacks. We leverage the existing user risk analysis system to identify potential attackers and serve advCAPTCHA to them, and then use their answers as queries to the attack model. In this manner, advCAPTCHA can be adapted and fine-tuned to accommodate the evolution of the attack model. Overall, advCAPTCHA can serve as a key enabler for generating robust captchas in practice and can provide useful guidelines for captcha developers and practitioners.
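The core idea behind adversarial captchas can be sketched with a generic one-step FGSM-style perturbation: nudge the captcha image along the sign of the loss gradient of a solver model so its prediction degrades, while keeping the change small enough that humans still read the text. This is not the paper's actual pipeline (advCAPTCHA targets deep sequence solvers and adapts via deployment-side feedback); the tiny linear "solver", the single-character setup, and the `epsilon` budget below are illustrative assumptions of ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "solver": a linear-softmax classifier over 10 character classes,
# operating on a flattened 8x8 glyph. A real captcha solver would be a CNN/RNN.
W = rng.normal(size=(10, 64))  # hypothetical solver weights

def solver_logits(x):
    return W @ x

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgsm_perturb(x, true_label, epsilon=0.1):
    """One-step FGSM: move the image along the sign of the cross-entropy
    gradient w.r.t. the input, increasing the solver's loss on true_label,
    with the perturbation bounded by epsilon in L-infinity norm."""
    p = softmax(solver_logits(x))
    onehot = np.zeros(10)
    onehot[true_label] = 1.0
    # For a linear-softmax model, dL/dx = W^T (p - onehot(y)).
    grad = W.T @ (p - onehot)
    return np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)

x = rng.uniform(0.0, 1.0, size=64)    # stand-in captcha glyph, pixels in [0, 1]
y = int(np.argmax(solver_logits(x)))  # the solver's current (correct) answer
x_adv = fgsm_perturb(x, y, epsilon=0.2)

print("perturbation L_inf:", np.abs(x_adv - x).max())
print("solver confidence before:", softmax(solver_logits(x))[y])
print("solver confidence after: ", softmax(solver_logits(x_adv))[y])
```

The key property, which the large-scale study measures at scale for real solvers, is that the solver's confidence on the true answer drops while the perturbation stays visually small; defenses that require prior knowledge of the solver use its true gradients as above, whereas black-box variants estimate them from queries.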

Supplementary Material

Presentation video: MOV file (Copy of CCS2020_fp210_ChenghuiShi - Brian Hollendyke.mov)




Published In

CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
October 2020, 2180 pages
ISBN: 9781450370899
DOI: 10.1145/3372297

Publisher

Association for Computing Machinery, New York, NY, United States


      Author Tags

      1. adversarial learning
      2. text captcha
      3. usable study

      Qualifiers

      • Research-article

Conference

CCS '20

      Acceptance Rates

      Overall Acceptance Rate 1,261 of 6,999 submissions, 18%



Cited By

• (2024) Improving the Security of Audio CAPTCHAs With Adversarial Examples. IEEE Transactions on Dependable and Secure Computing 21(2), 650-667. DOI: 10.1109/TDSC.2023.3236367
• (2024) A Complete and Comprehensive Semantic Perception of Mobile Traveling for Mobile Communication Services. IEEE Internet of Things Journal 11(3), 5467-5490. DOI: 10.1109/JIOT.2023.3307478
• (2024) Boosting the transferability of adversarial CAPTCHAs. Computers and Security 145. DOI: 10.1016/j.cose.2024.104000
• (2024) HiEI: A Universal Framework for Generating High-quality Emerging Images from Natural Images. Computer Vision – ECCV 2024, 129-145. DOI: 10.1007/978-3-031-72751-1_8
• (2023) A Survey on Adversarial Perturbations and Attacks on CAPTCHAs. Applied Sciences 13(7), 4602. DOI: 10.3390/app13074602
• (2023) Differentiated Location Privacy Protection in Mobile Communication Services: A Survey from the Semantic Perception Perspective. ACM Computing Surveys 56(3), 1-36. DOI: 10.1145/3617589
• (2023) An Experimental Investigation of Text-based CAPTCHA Attacks and Their Robustness. ACM Computing Surveys 55(9), 1-38. DOI: 10.1145/3559754
• (2023) Extended Research on the Security of Visual Reasoning CAPTCHA. IEEE Transactions on Dependable and Secure Computing 20(6), 4976-4992. DOI: 10.1109/TDSC.2023.3238408
• (2023) Fighting Attacks on Large Character Set CAPTCHAs Using Transferable Adversarial Examples. 2023 International Joint Conference on Neural Networks (IJCNN), 1-10. DOI: 10.1109/IJCNN54540.2023.10191881
• (2023) Farsi CAPTCHA Recognition Using Attention-Based Convolutional Neural Network. 2023 9th International Conference on Web Research (ICWR), 221-226. DOI: 10.1109/ICWR57742.2023.10139078
