
NNBits: Bit Profiling with a Deep Learning Ensemble Based Distinguisher

Conference paper in Topics in Cryptology – CT-RSA 2023 (CT-RSA 2023)

Abstract

We introduce a deep learning ensemble (NNBits) as a tool for bit-profiling and evaluation of cryptographic (pseudo) random bit sequences. On the one hand, we show how to use the NNBits ensemble to explain parts of the seminal work of Gohr [16]: Gohr’s depth-1 neural distinguisher reaches a test accuracy of 78.3% in round 6 for SPECK32/64 [3]. Using the bit-level information provided by NNBits, we can partially explain the accuracy obtained by Gohr (78.1% vs. 78.3%). This is achieved by constructing a distinguisher which uses only information about correct or incorrect predictions at the single-bit level and which achieves 78.1% accuracy. We also generalize two heuristic aspects of the construction of Gohr’s network: i) the particular input structure, which reflects expert knowledge of SPECK32/64, and ii) the cyclic learning rate.
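
To make this construction concrete, the following minimal sketch turns per-bit correctness information into a distinguisher; the predictor interface, the array shapes, and the decision threshold are illustrative assumptions, not the implementation from the paper.

```python
import numpy as np

def correctness_profile(bit_predictors, x):
    """1 if predictor i recovers bit x[i] from the remaining bits, else 0.

    Assumption: bit_predictors[i] is a trained model that predicts bit i
    of a sample from all other bits, as in the NNBits setting.
    """
    return np.array([int(p(np.delete(x, i)) == x[i])
                     for i, p in enumerate(bit_predictors)])

def bit_level_distinguisher(bit_predictors, x, threshold):
    # On random data every predictor is right ~50% of the time; on cipher
    # data some bits are biased, so the count of correctly predicted bits
    # separates the two classes (hypothetical threshold).
    return int(correctness_profile(bit_predictors, x).sum() >= threshold)
```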

On the other hand, we extend Gohr’s work to serve as a statistical test on avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128, and AES-128. In combination with the NNBits ensemble, we use the extended version of Gohr’s neural network to draw a comparison with the NIST Statistical Test Suite (NIST STS) on the previously mentioned avalanche datasets. We compare NNBits in conjunction with Gohr’s generalized network to the NIST STS and conclude that the NNBits ensemble performs either as well as the NIST STS or better. Furthermore, we demonstrate cryptanalytic insights that result from bit-level profiling with NNBits: for example, we show how to infer the strong input difference \((0x0040, 0x0000)\) for SPECK32/64, or a signature of the multiplication in the Galois field of AES-128.
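
For reference, an avalanche unit as used in these datasets can be sketched as follows; the integer-based `encrypt` interface and the equal input/output width are simplifying assumptions (see the NNBits repository [39] for the actual data pipeline).

```python
import numpy as np

def avalanche_unit(encrypt, x, n_bits):
    """Concatenate, for every input bit i, the XOR difference between
    encrypt(x) and encrypt(x with bit i flipped).

    With n_bits = 128 and a 128-bit output (AES-128) this gives
    128 * 128 = 16,384 bits, matching Appendix D.
    """
    base = encrypt(x)
    diffs = [base ^ encrypt(x ^ (1 << i)) for i in range(n_bits)]
    # pack each integer output difference into a flat 0/1 bit vector
    return np.concatenate([
        np.unpackbits(np.frombuffer(d.to_bytes(n_bits // 8, "big"),
                                    dtype=np.uint8))
        for d in diffs
    ])
```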


Notes

  1. Note that this statement has limited practical implications: even if enough data, sufficient representational power of the network, and sufficient computational resources for training are given, the training itself may be an NP-hard problem [26].

  2. This limit has recently been surpassed by human cryptanalysis in [8], giving machine learning a new threshold to overcome.

References

  1. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. In: 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2016), pp. 265–283 (2016)

  2. Géron, A.: Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media (2019). https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/

  3. Bacuieti, N.N., Batina, L., Picek, S.: Deep neural networks aiding cryptanalysis: a case study of the Speck distinguisher. ePrint, pp. 1–24 (2022). https://eprint.iacr.org/2022/341

  4. Baksi, A., Breier, J., Chen, Y., Dong, X.: Machine learning assisted differential distinguishers for lightweight ciphers. In: 2021 Design, Automation Test in Europe Conference Exhibition (DATE), pp. 176–181 (2021). https://doi.org/10.23919/DATE51398.2021.9474092

  5. Beaulieu, R., Shors, D., Smith, J., Treatman-Clark, S., Weeks, B., Wingers, L.: The SIMON and SPECK families of lightweight block ciphers. National Security Agency (NSA), 9800 Savage Road, Fort Meade, MD 20755, USA (2013)


  6. Bellini, E., Rossi, M.: Performance comparison between deep learning-based and conventional cryptographic distinguishers. In: Arai, K. (ed.) Intelligent Computing. LNNS, vol. 285, pp. 681–701. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80129-8_48


  7. Benamira, A., Gerault, D., Peyrin, T., Tan, Q.Q.: A deeper look at machine learning-based cryptanalysis. In: Canteaut, A., Standaert, F.-X. (eds.) EUROCRYPT 2021. LNCS, vol. 12696, pp. 805–835. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-77870-5_28


  8. Biryukov, A., dos Santos, L.C., Teh, J.S., Udovenko, A., Velichkov, V.: Meet-in-the-filter and dynamic counting with applications to Speck. Cryptology ePrint Archive (2022)


  9. Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996). https://doi.org/10.1023/A:1018054314350


  10. Brown, R.G.: DieHarder: a GNU public license random number tester. Duke University Physics Department, Durham, NC 27708-0305 (2006). http://www.phy.duke.edu/~rgb/General/dieharder.php

  11. Castro, J.C.H., Sierra, J.M., Seznec, A., Izquierdo, A., Ribagorda, A.: The strict avalanche criterion randomness test. Math. Comput. Simul. 68(1), 1–7 (2005). https://doi.org/10.1016/j.matcom.2004.09.001


  12. Daemen, J., Hoffert, S., Van Assche, G., Van Keer, R.: The design of Xoodoo and Xoofff. IACR Trans. Symmetric Cryptol. 2018(4), 1–38 (2018). https://doi.org/10.13154/tosc.v2018.i4.1-38

  13. Daemen, J., Rijmen, V.: AES proposal: Rijndael (1999). https://www.cs.miami.edu/home/burt/learning/Csc688.012/rijndael/rijndael_doc_V2.pdf

  14. Feistel, H.: Cryptography and computer privacy. Sci. Am. 228(5), 15–23 (1973)


  15. Gohr, A.: Deep speck (2019). https://github.com/agohr/deep_speck

  16. Gohr, A.: Improving attacks on round-reduced Speck32/64 using deep learning. In: Boldyreva, A., Micciancio, D. (eds.) CRYPTO 2019. LNCS, vol. 11693, pp. 150–179. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26951-7_6


  17. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. The MIT Press (2017). https://mitpress.mit.edu/books/deep-learning

  18. Gunning, D., Vorm, E., Wang, J.Y., Turek, M.: DARPA’s explainable AI (XAI) program: a retrospective. Appl. AI Lett. 2, e61 (2021). https://doi.org/10.1002/AIL2.61

  19. Gustafson, H., Dawson, E., Golić, J.D.: Automated statistical methods for measuring the strength of block ciphers. Stat. Comput. 7(2), 125–135 (1997)


  20. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016). https://doi.org/10.1109/CVPR.2016.90

  21. Hornik, K.: Approximation capabilities of multilayer feedforward networks. Neural Netw. 4(2), 251–257 (1991). https://doi.org/10.1016/0893-6080(91)90009-T


  22. Hou, Z., Ren, J., Chen, S., Fu, A.: Improve neural distinguishers of SIMON and SPECK. Secur. Commun. Netw. 2021 (2021). https://doi.org/10.1155/2021/9288229

  23. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018). https://doi.org/10.1109/CVPR.2018.00745

  24. Langford, S.K., Hellman, M.E.: Differential-linear cryptanalysis. In: Desmedt, Y.G. (ed.) CRYPTO 1994. LNCS, vol. 839, pp. 17–25. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48658-5_3


  25. L’Ecuyer, P., Simard, R.: TestU01: a software library in ANSI C for empirical testing of random number generators, software user’s guide. Département d’Informatique et Recherche opérationnelle, Université de Montréal, Montréal, Québec, Canada (2001). http://www.iro.umontreal.ca/~simardr/TestU01.zip

  26. Livni, R., Shalev-Shwartz, S., Shamir, O.: On the computational efficiency of training neural networks. Adv. Neural Inf. Process. Syst. 27, 855–863 (2014). https://papers.nips.cc/paper/2014/hash/3a0772443a0739141292a5429b952fe6-Abstract.html

  27. Makridakis, S., Spiliotis, E., Assimakopoulos, V.: The M4 competition: 100,000 time series and 61 forecasting methods. Int. J. Forecast. 36(1), 54–74 (2020). https://doi.org/10.1016/j.ijforecast.2019.04.014


  28. Moritz, P., et al.: Ray: a distributed framework for emerging AI applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2018), pp. 561–577 (2018)


  29. Oreshkin, B.N., Carpov, D., Chapados, N., Bengio, Y.: N-BEATS: neural basis expansion analysis for interpretable time series forecasting. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020. OpenReview.net (2020). https://openreview.net/forum?id=r1ecqn4YwB

  30. Reddi, S.J., Kale, S., Kumar, S.: On the convergence of Adam and beyond. arXiv preprint arXiv:1904.09237 (2019)

  31. Rukhin, A., et al.: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications. NIST (2010)


  32. Schrittwieser, J., Antonoglou, I., Hubert, T., Simonyan, K., Sifre, L., Schmitt, S., Guez, A., Lockhart, E., Hassabis, D., Graepel, T., Lillicrap, T., Silver, D.: Mastering Atari, Go, chess and shogi by planning with a learned model. Nature 588(7839), 604–609 (2020). https://doi.org/10.1038/s41586-020-03051-4


  33. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: 3rd International Conference on Learning Representations, ICLR 2015 – Conference Track Proceedings (2015)

  34. Soto, J.: Randomness testing of the advanced encryption standard candidate algorithms. NIST Interagency/Internal Report (NISTIR) (1999). http://www.nist.gov/customcf/get_pdf.cfm?pub_id=151193

  35. Soto, J., Bassham, L.: Randomness testing of the advanced encryption standard finalist candidates. NIST Interagency/Internal Report (NISTIR) (2000). https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=151216

  36. Švenda, P., Ukrop, M., Matyáš, V.: Determining cryptographic distinguishers for eStream and SHA-3 candidate functions with evolutionary circuits. In: Obaidat, M.S., Filipe, J. (eds.) ICETE 2013. CCIS, vol. 456, pp. 290–305. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44788-8_17


  37. PyTorch Team: PyTorch ResNet implementation (2022). https://pytorch.org/hub/pytorch_vision_resnet/

  38. Ray Team: Ray (2022). https://github.com/ray-project/ray

  39. Technology Innovation Institute (TII): Crypto-TII nnbits (2022). https://github.com/Crypto-TII/nnbits

  40. Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2


  41. Walker, J.: ENT: a pseudorandom number sequence test program. Web site (2008). http://www.fourmilab.ch/random/

  42. Yadav, T., Kumar, M.: Differential-ML distinguisher: machine learning based generic extension for differential cryptanalysis. In: Longa, P., Ràfols, C. (eds.) LATINCRYPT 2021. LNCS, vol. 12912, pp. 191–212. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-88238-9_10



Author information


Corresponding author

Correspondence to Anna Hambitzer.


Appendices

A Details for the Generalized Network Experiment

We apply the Generalized network as a distinguisher to the avalanche datasets of SPECK32/64, SPECK64/128, SPECK96/144, SPECK128/128, and AES-128 in settings which are advantageous for machine learning. Table 6 summarizes the experimental settings for each cipher. We generate input bit sequences X with the length of the avalanche unit of the respective cipher. A randomly chosen half of the inputs X has the label \(Y=0\) and contains random data. The other half has the label \(Y=1\) and contains avalanche units of the cipher, that is, non-random data. A Generalized network, as presented in Sect. 4.2, is trained on a subset \(X_\textrm{train}\) to predict the labels \(Y_\textrm{train}\) for 10 epochs. Subsequently, previously unseen data \(X_\textrm{test}\) is used to evaluate the accuracy A of the distinguisher.
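
A minimal sketch of this training and evaluation loop is shown below, assuming TensorFlow/Keras [1] as the backend; make_avalanche_units, make_random_units, and build_generalized_network are hypothetical stand-ins for the paper’s data generation and the Generalized network of Sect. 4.2.

```python
import numpy as np
from tensorflow import keras

# Hypothetical helpers (assumed): avalanche units for the chosen cipher
# and round, plus uniformly random sequences of the same bit length.
X_real = make_avalanche_units(cipher="speck32_64", rounds=6, n=100_000)
X_rand = make_random_units(n=100_000, n_bits=X_real.shape[1])

# label 0 = random data, label 1 = avalanche units (not random)
X = np.concatenate([X_rand, X_real]).astype(np.float32)
Y = np.concatenate([np.zeros(len(X_rand)), np.ones(len(X_real))])
idx = np.random.permutation(len(X))
X, Y = X[idx], Y[idx]

n_test = len(X) // 10                        # held-out, previously unseen data
X_train, Y_train = X[n_test:], Y[n_test:]
X_test, Y_test = X[:n_test], Y[:n_test]

model = build_generalized_network(input_bits=X.shape[1])   # Sect. 4.2
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X_train, Y_train, epochs=10, batch_size=5000)    # assumed batch size
_, A = model.evaluate(X_test, Y_test)        # distinguisher accuracy A
```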

Table 6 summarizes the avalanche unit bit sizes, the number of avalanche units for training \(X_\textrm{train}\) and testing \(X_\textrm{test}\), as well as the distinguisher’s accuracy A for relevant rounds. The accuracy is given as the mean and standard deviation over four runs of the previously described experiment.

Table 6. Accuracies A for distinguishing avalanche units of the respective cipher from random data. Bold marks the first round for which the distinguisher offers no advantage over a random guess.

B Details of NIST Results

Fig. 12. Randomness evaluation of rounds by NIST STS.

C Details of NNBits Results

Table 7. Summary of the number of training and testing avalanche units presented to the neural network ensemble for each cipher. The detailed training outcomes for each round are shown in Table 8.
Table 8. Detailed results for the NNBits analysis presented in Table 5. For each round r, the training settings (number of epochs, number of training avalanche sequences, number of testing avalanche sequences, and the runtime in minutes) as well as the resulting test accuracy, p-value, and randomness result are shown.

D Bit Profiles of AES-128

D.1 AES Round 1/10 Bit Pattern

The previous analysis of SPECK32/64 has shown a particular region of weak bits. In AES-128, however, we find repeating patterns of weak and strong bits in rounds 1 and 2 within the 128-bit sub-blocks of the avalanche unit.

Figure 13 shows details of the patterns observed after one round of AES-128. The complete avalanche unit of AES-128 consists of \(128 \times 128 = 16,384\) bits. We analyze the complete avalanche unit in blocks of 128 bits (Fig. 13a) and can identify four recurring patterns of weak and strong bits that occur throughout the avalanche unit. For example, one of these patterns occurs in the avalanche blocks \(s=0\) to \(s=7\) (Fig. 13b). Exemplary sections of the distributions of weak and strong bit patterns are shown in Fig. 13c).
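
The recurring patterns can be extracted mechanically from the ensemble output; in the following sketch, the per-bit accuracy array acc and the 0.999 weak-bit threshold are assumptions rather than values from the paper.

```python
import numpy as np

# acc: one test accuracy per bit of the 16,384-bit AES-128 avalanche unit,
# as produced by the NNBits ensemble (assumed given here).
blocks = acc.reshape(128, 128)        # row s = avalanche block s
weak = blocks < 0.999                 # assumed threshold for "weak" bits

# Deduplicating the weak/strong masks yields the recurring patterns.
patterns, which = np.unique(weak, axis=0, return_inverse=True)
print(len(patterns), "recurring patterns")   # 4 in round 1 (Fig. 13b)
print(which[:8])                             # blocks s=0..7 share one pattern
```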

After one round of AES-128, 96 consecutive bits of the 128 bits in each subblock can be predicted with 100% accuracy. The remaining 32 bits (4 bytes) can be predicted with less than 100% accuracy, which can be understood as follows. The round function of the AES is such that changing one byte in the input results in differences in one column of the output after one round (with the other columns remaining undisturbed). This is a well-known fact about the AES, due in particular to the MDS property of the MixColumns operation. For the avalanche dataset, this implies that for each subblock of 128 bits (corresponding to one input difference bit), 4 bytes (one column) are nonzero, while the remaining bytes are all zero.
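
This structure can be checked directly on round-1 data: the 96 bits per subblock outside the active column are always zero, so a constant predictor is exact on them. A minimal sketch, assuming units holds round-1 AES-128 avalanche units as rows of 16,384 bits:

```python
import numpy as np

blocks = units.reshape(len(units), 128, 128)   # (sample, subblock s, bit)
always_zero = (blocks == 0).all(axis=0)        # bits constant at 0 across data
# One-column property: exactly 32 bits (4 bytes) are active per subblock,
# so 128 - 32 = 96 bits are trivially predictable with 100% accuracy.
print((~always_zero).sum(axis=1))              # expected: 32 for every s
```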

The distribution of patterns of Fig. 13a) and Fig. 13b) is still observable after two rounds of AES (we show the equivalent 2-round patterns in Fig. 14c). When encrypting for two rounds, each of the nonzero bytes of round 1 is sent to a different column through the ShiftRows operation, and then propagated to a whole column through MixColumns, so that after two rounds, all bytes of the dataset are nonzero. Furthermore, there are relations between the bytes of each column: MixColumns applies a linear transformation to a 4-byte column, and by construction, only one byte is nonzero in each column. Therefore, the resulting values are multiples (in the Galois field of AES) of a single variable, with the coefficients (2, 3, 1, 1), in an order that depends on the position of the 128-bit block in the avalanche dataset. The bytes with coefficient 1 are consistently predicted, whereas only some bits of the bytes with coefficients 2 and 3 are reliably predicted. This explains the peculiar pattern observed in the prediction, where for each group of 4 bytes, there are peaks for 2 bytes and for some of the bits among the remaining 2 bytes.
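
The (2, 3, 1, 1) relation can be made explicit with the standard GF(2^8) arithmetic of the AES; the snippet below is a textbook xtime implementation, not code from the paper.

```python
def xtime(b: int) -> int:
    """Multiply by 2 in GF(2^8) modulo the AES polynomial x^8+x^4+x^3+x+1."""
    b <<= 1
    return (b ^ 0x1B) & 0xFF if b & 0x100 else b

def mul(c: int, b: int) -> int:
    """Multiply b by a MixColumns coefficient c in {1, 2, 3}."""
    return {1: b, 2: xtime(b), 3: xtime(b) ^ b}[c]

v = 0x57                                    # the single nonzero byte
column = [mul(c, v) for c in (2, 3, 1, 1)]  # one MixColumns output column
# Bytes with coefficient 1 equal v itself and are hence predicted
# consistently; the conditional reduction by 0x1B in xtime scrambles
# individual bits of 2*v and 3*v, matching the partially predictable bytes.
print([hex(b) for b in column])             # ['0xae', '0xf9', '0x57', '0x57']
```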

Fig. 13. NNBits ensemble analysis of the avalanche sequence of AES-128. a) The avalanche block s corresponds to bits \(128 \times s \ldots 128 \times (s+1)\). b) Four patterns occur over the total of 16,384 bits in one avalanche unit. c) Examples of the recurring byte patterns of weak and strong bits observed in round 1/10.

D.2 AES Round 2/10 Bit Pattern

Please see Appendix D for the context of the analysis shown in Fig. 14.

Fig. 14. NNBits ensemble analysis of the avalanche sequence of AES-128. a) The avalanche block s corresponds to bits \(128 \times s \ldots 128 \times (s+1)\). b) Four patterns occur over the total of 16,384 bits in one avalanche unit. c) Examples of the recurring patterns of weak (green) and strong (orange) bits. The pattern is actually the same, but shifted, with a starting point indicated by the black arrow. (Color figure online)


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Hambitzer, A., Gerault, D., Huang, Y.J., Aaraj, N., Bellini, E. (2023). NNBits: Bit Profiling with a Deep Learning Ensemble Based Distinguisher. In: Rosulek, M. (eds) Topics in Cryptology – CT-RSA 2023. CT-RSA 2023. Lecture Notes in Computer Science, vol 13871. Springer, Cham. https://doi.org/10.1007/978-3-031-30872-7_19


  • DOI: https://doi.org/10.1007/978-3-031-30872-7_19


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-30871-0

  • Online ISBN: 978-3-031-30872-7

