Password and Passphrase Guessing with Recurrent Neural Networks

Nosenko, Alex; Cheng, Yuan; Chen, Haiquan

doi:10.1007/s10796-022-10325-x


Published: 27 August 2022

Information Systems Frontiers, Volume 25, pages 549–565 (2023)


Abstract

Most online services continue to rely on text-based passwords as the primary authentication mechanism. With a growing number of these services and limited creativity for devising new memorable passwords, users tend to reuse their passwords across multiple platforms. These factors, combined with the increasing number of leaked passwords, make passwords vulnerable to cross-site guessing attacks. Over the years, researchers have proposed several prevalent methods to predict subsequently used passwords, such as dictionary attacks, rule-based approaches, neural networks, and combinations of the above. We exploit the correlation between the similarity and predictability of these subsequent passwords in a dataset of 28.8 million users and their 61.5 million passwords. We use a rule-based approach but delegate rule derivation, classification, and prediction to a Recurrent Neural Network (RNN). We limit the number of guessing attempts to ten yet achieve an astonishingly high prediction accuracy of up to 83% in under five attempts, twice as high as that of any other known model. This result makes our model effective for targeted online password guessing without being detected or locked out. To the best of our knowledge, this study is the first attempt of its kind using an RNN. We also explore the use of RNN models in passphrase guessing. Passphrases are perceived to be more secure and easier to remember than passwords of the same length. We use a dataset that contains around 100,000 distinct phrases. We demonstrate that RNN models can predict complete passphrases given the initial word at a rate of up to 40%, twice that of other known approaches. Furthermore, our predictions can succeed in under 5,000 attempts, a 100% improvement over existing algorithms. In addition, this approach provides ease of deployment and low resource consumption. To our knowledge, it is the first attempt to exploit RNNs for passphrase guessing.
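The abstract reports accuracy under a fixed budget of guessing attempts (ten at most, with the best results in under five). As a rough illustration of how such a budget-limited metric can be computed, the sketch below counts how often a user's next password appears among the first k guesses derived from a known password. It is not the authors' model: `toy_guesser`, its hand-written rules, and the sample pairs are hypothetical stand-ins for the trained RNN and the real dataset.

```python
from typing import Callable, Iterable, List, Tuple


def success_within_k(
    pairs: Iterable[Tuple[str, str]],
    guesser: Callable[[str], List[str]],
    k: int,
) -> float:
    """Fraction of (known, target) pairs whose target is among the first k guesses."""
    hits = 0
    total = 0
    for known, target in pairs:
        total += 1
        # Only the first k ranked guesses count, mimicking a lockout budget.
        if target in guesser(known)[:k]:
            hits += 1
    return hits / total if total else 0.0


def toy_guesser(known: str) -> List[str]:
    """Hypothetical rule list standing in for a trained model's ranked output."""
    return [
        known,               # exact reuse
        known + "1",         # append a digit
        known.capitalize(),  # capitalize the first letter
        known + "!",         # append a symbol
        known[::-1],         # reverse the string
    ]


# Made-up (known password, subsequent password) pairs for illustration only.
pairs = [
    ("hunter", "hunter1"),
    ("summer", "Summer"),
    ("dragon", "dragon2020"),
]

for k in (1, 2, 5):
    print(k, success_within_k(pairs, toy_guesser, k))
```

Replacing `toy_guesser` with a function that returns a model's ranked candidates would leave the metric unchanged, which is why the guesser is passed in as a parameter.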


References

Blanchard, N.K., Malaingre, C., & Selker, T. (2018). Improving security and usability of passphrases with guided word choice. In 34th annual computer security applications conference (pp. 723–732)
Bonneau, J., & Shutova, E. (2012). Linguistic properties of multi-word passphrases. In International conference on financial cryptography and data security (pp. 1–12). Springer
Brostoff, S., & Sasse, M.A. (2003). “Ten strikes and you’re out”: Increasing the number of login attempts can improve password usability. In CHI 2003 workshop on human-computer interaction and security systems
Burr, W., Dodson, D., Perlner, R., Gupta, S., & Nabbus, E. (2013). NIST SP800-63-2: Electronic authentication guideline. Technical report, National Institute of Standards and Technology, Reston, VA
Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv:1409.1259
Das, A., Bonneau, J., Caesar, M., Borisov, N., & Wang, X. (2014). The tangled web of password reuse. In NDSS (Vol. 14, pp. 23–26)
Davies, M. (2009). The 385+ million word corpus of contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics, 14(2), 159–190.
Florencio, D., & Herley, C. (2007). A large-scale study of web password habits. In 16th International conference on World Wide Web (pp. 657–666)
Fu, C., Duan, M., Dai, X., Wei, Q., Wu, Q., & Zhou, R. (2021). Densegan: A password guessing model based on densenet and passgan. In International conference on information security practice and experience (pp. 296–305). Springer
Grassi, P. A., Garcia, M. E., & Fenton, J. L. (2017). DRAFT NIST SP800-63-3 digital identity guidelines. Technical report, National Institute of Standards and Technology, Los Altos, CA
Han, W., Xu, M., Zhang, J., Wang, C., Zhang, K., & Wang, X. S. (2020). TransPCFG: transferring the grammars from short passwords to guess long passwords effectively. IEEE Transactions on Information Forensics and Security, 16, 451–465.
Haque, S.T., Wright, M., & Scielzo, S. (2013). A study of user password strategy for multiple accounts. In 3rd ACM conference on data and application security and privacy (pp. 173–176)
Hardeniya, N. (2015). NLTK Essentials, p. 28. Packt Publishing Ltd
Hitaj, B., Gasti, P., Ateniese, G., & Perez-Cruz, F. (2019). Passgan: A deep learning approach for password guessing. In International conference on applied cryptography and network security (pp. 217–237). Springer
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Huth, A., Orlando, M., & Pesante, L. (2012). Password security, protection, and management. United States Computer Emergency Readiness Team
Joudaki, Z., Thorpe, J., & Martin, M.V. (2018). Reinforcing system-assigned passphrases through implicit learning. In 2018 ACM conference on computer and communications security (pp. 1533–1548)
Keith, M., Shao, B., & Steinbart, P. (2005). The effectiveness and usability of passphrases for authentication. In 11th Americas conference on information systems (pp. 3354–3357)
Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980
Kouretas, I., & Paliouras, V. (2019). Simplified hardware implementation of the softmax activation function. In 2019 8th international conference on modern circuits and systems technologies (MOCAST) (pp. 1–4). IEEE
Kuo, C., Romanosky, S., & Cranor, L.F. (2006). Human selection of mnemonic phrase-based passwords. In Second symposium on usable privacy and security (SOUPS) (pp. 67–78)
Kurzban, S. A. (1985). Easily remembered passphrases: a better approach. ACM SIGSAC Review, 3(2–4), 10–21.
Labrande, H. (2015). Crack me I’m famous: cracking weak passphrases using publicly-available sources. In 2015 Information and Communications Technology Security Symposium (SSTIC) (pp. 479–484).
Li, H., Chen, M., Yan, S., Jia, C., & Li, Z. (2019). Password guessing via neural language modeling. In Proceedings of the international conference on machine learning for cyber security (pp. 78–93). Springer
Liu, Y., Xia, Z., Yi, P., Yao, Y., Xie, T., Wang, W., & Zhu, T. (2018). GENPass: A general deep learning model for password guessing with PCFG rules and adversarial generation. In 2018 IEEE International Conference on Communications (ICC) (pp. 1–6). IEEE
Melicher, W., Ur, B., Segreti, S.M., Komanduri, S., Bauer, L., Christin, N., & Cranor, L.F. (2016). Fast, lean, and accurate: Modeling password guessability using neural networks. In 25th USENIX security symposium (pp. 175–191)
Murray, H., & Malone, D. (2018). Exploring the impact of password dataset distribution on guessing. In 2018 16th annual conference on privacy, security and trust (PST) (pp. 1–8). IEEE
Narayanan, A., & Shmatikov, V. (2005). Fast dictionary attacks on passwords using time-space tradeoff. In 12th ACM conference on computer and communications security (pp. 364–372)
Nosenko, A., Cheng, Y., & Chen, H. (2021). Learning password modification patterns with recurrent neural networks. In International conference on secure knowledge management in artificial intelligence era (pp. 110–129). Springer
Notoatmodjo, G., & Thomborson, C. (2009). Passwords and perceptions. In Seventh Australasian conference on information security (pp. 71–78)
Pasquini, D., Gangwal, A., Ateniese, G., Bernaschi, M., & Conti, M. (2021). Improving password guessing via representation learning. In 2021 42nd IEEE symposium on security and privacy (pp. 1382–1399). IEEE
Pennington, J., Socher, R., & Manning, C.D. (2014). Glove: Global vectors for word representation. In 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532–1543)
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
Rao, A., Jha, B., & Kini, G. (2013). Effect of grammar on security of long passwords. In Third ACM conference on data and application security and privacy (pp. 317–324)
Rawlings, R. (2020). Password Habits in the US and the UK: This Is What We Found. https://nordpass.com/blog/password-habits-statistics/. Accessed 26 July 2022.
Schumacher, M., Roßner, R., & Vach, W. (1996). Neural networks and logistic regression: Part I. Computational Statistics & Data Analysis, 21(6), 661–682.
Sparell, P., & Simovits, M. (2016). Linguistic cracking of passphrases using Markov chains. Cryptology ePrint Archive
Stobert, E., & Biddle, R. (2014). The password life cycle: user behaviour in managing passwords. In 10th symposium on usable privacy and security (SOUPS) (pp. 243–255)
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30
Von Zezschwitz, E., De Luca, A., & Hussmann, H. (2013). Survival of the shortest: A retrospective analysis of influencing factors on password composition. In IFIP Conference on Human-Computer Interaction (pp. 460–467). Springer
Walia, K.S., Shenoy, S., & Cheng, Y. (2020). An empirical analysis on the usability and security of passwords. In 2020 21st IEEE international conference on information reuse and integration for data science (IRI) (pp. 1–8). IEEE
Wang, C., Jan, S.T., Hu, H., Bossart, D., & Wang, G. (2018). The next domino to fall: Empirical analysis of user passwords across online services. In 8th ACM conference on data and application security and privacy (pp. 196–203)
Wang, D., Zhang, Z., Wang, P., Yan, J., & Huang, X. (2016). Targeted online password guessing: An underestimated threat. In 2016 ACM conference on computer and communications security (pp. 1242–1254)
Weir, M., Aggarwal, S., De Medeiros, B., & Glodek, B. (2009). Password cracking using probabilistic context-free grammars. In 2009 30th IEEE symposium on security and privacy (pp. 391–405). IEEE
Woo, S.S., & Mirkovic, J. (2016). Improving recall and security of passphrases through use of mnemonics. In 10th International conference on passwords
Xie, Z., Zhang, M., Yin, A., & Li, Z. (2020). A new targeted password guessing model. In Australasian conference on information security and privacy (pp. 350–368). Springer
Xu, M., Wang, C., Yu, J., Zhang, J., Zhang, K., & Han, W. (2021). Chunk-level password guessing: Towards modeling refined password composition representations. In 2021 ACM conference on computer and communications security (pp. 5–20)
Xu, G., Meng, Y., Qiu, X., Yu, Z., & Wu, X. (2019). Sentiment analysis of comment texts based on BiLSTM. IEEE Access, 7, 51522–51532.
Yoo, J.Y., Morris, J.X., Lifland, E., & Qi, Y. (2020). Searching for a search method: Benchmarking search algorithms for generating NLP adversarial examples. arXiv:2009.06368
Zhang, Y., Monrose, F., & Reiter, M.K. (2010). The security of modern password expiration: An algorithmic framework and empirical analysis. In 17th ACM conference on computer and communications security (pp. 176–186)


Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Authors and Affiliations

Santa Clara County Office of Education, 1290 Ridder Park Dr, San Jose, 95131, CA, USA
Alex Nosenko
Department of Computer Science, California State University, Sacramento, 6000 J Street, Sacramento, 95819, CA, USA
Yuan Cheng & Haiquan Chen


Corresponding author

Correspondence to Yuan Cheng.

Ethics declarations

A preliminary version of this article was presented at SKM ’21 (Nosenko et al., 2021).

Financial and non-financial interests

The authors have no financial or non-financial interests to declare that are relevant to the content of this manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Nosenko, A., Cheng, Y. & Chen, H. Password and Passphrase Guessing with Recurrent Neural Networks. Inf Syst Front 25, 549–565 (2023). https://doi.org/10.1007/s10796-022-10325-x


Accepted: 12 August 2022
Published: 27 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s10796-022-10325-x
