Password and Passphrase Guessing with Recurrent Neural Networks

Information Systems Frontiers

Abstract

Most online services still rely on text-based passwords as their primary authentication mechanism. As the number of services grows and users struggle to devise new memorable passwords, they tend to reuse passwords across multiple platforms. Combined with the increasing number of leaked passwords, this reuse makes passwords vulnerable to cross-site guessing attacks. Over the years, researchers have proposed several methods to predict subsequently used passwords, such as dictionary attacks, rule-based approaches, neural networks, and combinations of these. We exploit the correlation between the similarity and predictability of subsequent passwords in a dataset of 28.8 million users and their 61.5 million passwords. We use a rule-based approach but delegate rule derivation, classification, and prediction to a Recurrent Neural Network (RNN). We limit the number of guessing attempts to ten yet achieve a prediction accuracy of up to 83% in under five attempts, twice that of any other known model. This result makes our model effective for targeted online password guessing without being detected or locked out. To the best of our knowledge, this study is the first attempt of its kind to use an RNN. We also explore the use of RNN models in passphrase guessing. Passphrases are perceived to be more secure and easier to remember than passwords of the same length. We use a dataset of around 100,000 distinct phrases. We demonstrate that RNN models can predict complete passphrases from the initial word at a rate of up to 40%, twice that of other known approaches. Furthermore, our predictions can succeed in under 5,000 attempts, a 100% improvement over existing algorithms. In addition, this approach is easy to deploy and consumes few resources. To our knowledge, it is the first attempt to exploit RNNs for passphrase guessing.
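The accuracy figures above are stated relative to a budget of guessing attempts. As a minimal sketch of how such a "success within k attempts" rate could be computed, and not the authors' evaluation code, the Python below assumes each account comes with a ranked list of candidate guesses and one true subsequent password. The function name `success_within_k` and the toy data are hypothetical and used only for illustration.

```python
from typing import Sequence


def success_within_k(
    guesses: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int,
) -> float:
    """Return the fraction of targets found among the first k ranked guesses."""
    if len(guesses) != len(targets):
        raise ValueError("guesses and targets must have the same length")
    if not targets:
        return 0.0
    # An account counts as a hit if its true password is in the top-k guesses.
    hits = sum(
        1 for ranked, target in zip(guesses, targets) if target in ranked[:k]
    )
    return hits / len(targets)


# Toy data, made up for illustration; not taken from the paper's dataset.
ranked_guesses = [
    ["summer2019", "Summer2019", "summer2020"],
    ["dragon1", "dragon12", "dragon123"],
    ["qwerty!", "qwerty1", "Qwerty"],
]
true_passwords = ["summer2020", "dragon1", "letmein"]

rate_at_1 = success_within_k(ranked_guesses, true_passwords, 1)
rate_at_3 = success_within_k(ranked_guesses, true_passwords, 3)
print(f"within 1 attempt: {rate_at_1:.2f}, within 3 attempts: {rate_at_3:.2f}")
```

Under these assumptions, raising k can only keep the rate the same or increase it, which is why a result quoted for a small attempt budget is the more demanding one for online guessing.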

References

  • Blanchard, N.K., Malaingre, C., & Selker, T. (2018). Improving security and usability of passphrases with guided word choice. In 34th annual computer security applications conference (pp. 723–732)

  • Bonneau, J., & Shutova, E. (2012). Linguistic properties of multi-word passphrases. In International conference on financial cryptography and data security (pp. 1–12). Springer

  • Brostoff, S., & Sasse, M.A. (2003). “Ten strikes and you’re out”: Increasing the number of login attempts can improve password usability. In CHI 2003 workshop on human-computer interaction and security systems

  • Burr, W., Dodson, D., Perlner, R., Gupta, S., & Nabbus, E. (2013). NIST SP800-63-2: Electronic authentication guideline. Technical report, National Institute of Standards and Technology, Reston, VA

  • Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv:1409.1259

  • Das, A., Bonneau, J., Caesar, M., Borisov, N., & Wang, X. (2014). The tangled web of password reuse. In NDSS (Vol. 14, pp. 23–26)

  • Davies, M. (2009). The 385+ million word corpus of contemporary American English (1990–2008+): Design, architecture, and linguistic insights. International Journal of Corpus Linguistics, 14(2), 159–190.

  • Florencio, D., & Herley, C. (2007). A large-scale study of web password habits. In 16th International conference on World Wide Web (pp. 657–666)

  • Fu, C., Duan, M., Dai, X., Wei, Q., Wu, Q., & Zhou, R. (2021). Densegan: A password guessing model based on densenet and passgan. In International conference on information security practice and experience (pp. 296–305). Springer

  • Grassi, P. A., Garcia, M. E., & Fenton, J. L. (2017). DRAFT NIST SP800-63-3 digital identity guidelines. Technical report, National Institute of Standards and Technology, Los Altos, CA

  • Han, W., Xu, M., Zhang, J., Wang, C., Zhang, K., & Wang, X. S. (2020). TransPCFG: transferring the grammars from short passwords to guess long passwords effectively. IEEE Transactions on Information Forensics and Security, 16, 451–465.

  • Haque, S.T., Wright, M., & Scielzo, S. (2013). A study of user password strategy for multiple accounts. In 3rd ACM conference on data and application security and privacy (pp. 173–176)

  • Hardeniya, N. (2015). NLTK Essentials, p. 28. Packt Publishing Ltd

  • Hitaj, B., Gasti, P., Ateniese, G., & Perez-Cruz, F. (2019). Passgan: A deep learning approach for password guessing. In International conference on applied cryptography and network security (pp. 217–237). Springer

  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

  • Huth, A., Orlando, M., & Pesante, L. (2012). Password security, protection, and management. United States Computer Emergency Readiness Team

  • Joudaki, Z., Thorpe, J., & Martin, M.V. (2018). Reinforcing system-assigned passphrases through implicit learning. In 2018 ACM conference on computer and communications security (pp. 1533–1548)

  • Keith, M., Shao, B., & Steinbart, P. (2005). The effectiveness and usability of passphrases for authentication. In 11th Americas conference on information systems (pp. 3354–3357)

  • Kingma, D.P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980

  • Kouretas, I., & Paliouras, V. (2019). Simplified hardware implementation of the softmax activation function. In 2019 8th international conference on modern circuits and systems technologies (MOCAST) (pp. 1–4). IEEE

  • Kuo, C., Romanosky, S., & Cranor, L.F. (2006). Human selection of mnemonic phrase-based passwords. In Second symposium on usable privacy and security (SOUPS) (pp. 67–78)

  • Kurzban, S. A. (1985). Easily remembered passphrases: a better approach. ACM SIGSAC Review, 3(2–4), 10–21.

  • Labrande, H. (2015). Crack me I’m famous: cracking weak passphrases using publicly-available sources. In 2015 Information and Communications Technology Security Symposium (SSTIC) (pp. 479–484).

  • Li, H., Chen, M., Yan, S., Jia, C., & Li, Z. (2019). Password guessing via neural language modeling. In Proceedings of the international conference on machine learning for cyber security (pp. 78–93). Springer

  • Liu, Y., Xia, Z., Yi, P., Yao, Y., Xie, T., Wang, W., & Zhu, T. (2018). GENPass: A general deep learning model for password guessing with PCFG rules and adversarial generation. In 2018 IEEE International Conference on Communications (ICC) (pp. 1–6). IEEE

  • Melicher, W., Ur, B., Segreti, S.M., Komanduri, S., Bauer, L., Christin, N., & Cranor, L.F. (2016). Fast, lean, and accurate: Modeling password guessability using neural networks. In 25th USENIX security symposium (pp. 175–191)

  • Murray, H., & Malone, D. (2018). Exploring the impact of password dataset distribution on guessing. In 2018 16th annual conference on privacy, security and trust (PST) (pp. 1–8). IEEE

  • Narayanan, A., & Shmatikov, V. (2005). Fast dictionary attacks on passwords using time-space tradeoff. In 12th ACM conference on computer and communications security (pp. 364–372)

  • Nosenko, A., Cheng, Y., & Chen, H. (2021). Learning password modification patterns with recurrent neural networks. In International conference on secure knowledge management in artificial intelligence era (pp. 110–129). Springer

  • Notoatmodjo, G., & Thomborson, C. (2009). Passwords and perceptions. In Seventh Australasian conference on information security (pp. 71–78)

  • Pasquini, D., Gangwal, A., Ateniese, G., Bernaschi, M., & Conti, M. (2021). Improving password guessing via representation learning. In 2021 42nd IEEE symposium on security and privacy (pp. 1382–1399). IEEE

  • Pennington, J., Socher, R., & Manning, C.D. (2014). Glove: Global vectors for word representation. In 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532–1543)

  • Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.

  • Rao, A., Jha, B., & Kini, G. (2013). Effect of grammar on security of long passwords. In Third ACM conference on data and application security and privacy (pp. 317–324)

  • Rawlings, R. (2020). Password Habits in the US and the UK: This Is What We Found. https://nordpass.com/blog/password-habits-statistics/. Accessed 26 July 2022.

  • Schumacher, M., Roßner, R., & Vach, W. (1996). Neural networks and logistic regression: Part I. Computational Statistics & Data Analysis, 21(6), 661–682.

  • Sparell, P., & Simovits, M. (2016). Linguistic cracking of passphrases using Markov chains. Cryptology ePrint Archive

  • Stobert, E., & Biddle, R. (2014). The password life cycle: user behaviour in managing passwords. In 10th symposium on usable privacy and security (SOUPS) (pp. 243–255)

  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30

  • Von Zezschwitz, E., De Luca, A., & Hussmann, H. (2013). Survival of the shortest: A retrospective analysis of influencing factors on password composition. In IFIP Conference on Human-Computer Interaction (pp. 460–467). Springer

  • Walia, K.S., Shenoy, S., & Cheng, Y. (2020). An empirical analysis on the usability and security of passwords. In 2020 21st IEEE international conference on information reuse and integration for data science (IRI) (pp. 1–8). IEEE

  • Wang, C., Jan, S.T., Hu, H., Bossart, D., & Wang, G. (2018). The next domino to fall: Empirical analysis of user passwords across online services. In 8th ACM conference on data and application security and privacy (pp. 196–203)

  • Wang, D., Zhang, Z., Wang, P., Yan, J., & Huang, X. (2016). Targeted online password guessing: An underestimated threat. In 2016 ACM conference on computer and communications security (pp. 1242–1254)

  • Weir, M., Aggarwal, S., De Medeiros, B., & Glodek, B. (2009). Password cracking using probabilistic context-free grammars. In 2009 30th IEEE symposium on security and privacy (pp. 391–405). IEEE

  • Woo, S.S., & Mirkovic, J. (2016). Improving recall and security of passphrases through use of mnemonics. In 10th International conference on passwords

  • Xie, Z., Zhang, M., Yin, A., & Li, Z. (2020). A new targeted password guessing model. In Australasian conference on information security and privacy (pp. 350–368). Springer

  • Xu, M., Wang, C., Yu, J., Zhang, J., Zhang, K., & Han, W. (2021). Chunk-level password guessing: Towards modeling refined password composition representations. In 2021 ACM conference on computer and communications security (pp. 5–20)

  • Xu, G., Meng, Y., Qiu, X., Yu, Z., & Wu, X. (2019). Sentiment analysis of comment texts based on BiLSTM. IEEE Access, 7, 51522–51532.

  • Yoo, J.Y., Morris, J.X., Lifland, E., & Qi, Y. (2020). Searching for a search method: Benchmarking search algorithms for generating NLP adversarial examples. arXiv:2009.06368

  • Zhang, Y., Monrose, F., & Reiter, M.K. (2010). The security of modern password expiration: An algorithmic framework and empirical analysis. In 17th ACM conference on computer and communications security (pp. 176–186)

Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Corresponding author

Correspondence to Yuan Cheng.

Ethics declarations

A preliminary version of this article was presented at SKM ’21 (Nosenko et al., 2021).

Financial and non-financial interests

The authors have no financial or non-financial interests to declare that are relevant to the content of this manuscript.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Nosenko, A., Cheng, Y. & Chen, H. Password and Passphrase Guessing with Recurrent Neural Networks. Inf Syst Front 25, 549–565 (2023). https://doi.org/10.1007/s10796-022-10325-x

  • DOI: https://doi.org/10.1007/s10796-022-10325-x
