ABSTRACT
In this work, we propose a high-throughput method for generating high-quality pseudo-random numbers using the bitslicing technique. Instead of the conventional row-major data representation, this technique employs a column-major data representation, which allows the bitsliced implementation to use the full datapath width of the hardware platform. Using this data representation as the building block of the algorithms, we showcase the capability and scalability of the proposed method on various PRNGs in the categories of block and stream ciphers. The LFSR-based (Linear Feedback Shift Register) nature of the PRNGs in our implementation suits the GPU's many-core structure well because of its register-oriented architecture. In the proposed SIMD-vectorized GPU implementation, each GPU thread can generate several sets of 32 pseudo-random bits in each LFSR clock cycle. We then compare our implementation with some of the most significant PRNGs that offer satisfactory throughput and meet randomness criteria. The proposed implementation passes the NIST tests for statistical randomness and satisfies the bit-wise correlation criteria. Compared with computer-based PRNGs and optical solutions, the technique is efficient in terms of both performance and performance per cost while maintaining an acceptable randomness measure. Among all of the CPRNGs implemented with the proposed method, the highest performance is achieved by the MICKEY 2.0 algorithm, which shows a 40% improvement over cuRAND, NVIDIA's proprietary state-of-the-art high-performance PRNG library, achieving 2.72 Tb/s of throughput on the affordable NVIDIA RTX 2080 Ti.
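The column-major idea can be illustrated with a minimal sketch. It is not the paper's GPU implementation: it assumes a toy 16-bit Fibonacci LFSR with illustrative tap positions, and it uses plain Python integers as stand-ins for 32-bit registers. All names here (`WIDTH`, `NBITS`, `TAPS`, `scalar_step`, `to_slices`, `bitsliced_step`) are hypothetical. Slice `i` packs bit `i` of 32 independent LFSR instances into one word, so a single XOR on slices updates all 32 instances at once and each clock yields 32 output bits.

```python
from typing import List, Tuple

WIDTH = 32          # assumed lanes per word, mirroring a 32-bit register
NBITS = 16          # assumed LFSR length for this toy example
TAPS = (0, 2, 3, 5)  # illustrative tap positions, not taken from the paper


def scalar_step(state: int) -> Tuple[int, int]:
    """Clock one row-major LFSR once; return (new_state, output_bit)."""
    out = state & 1
    fb = 0
    for t in TAPS:
        fb ^= (state >> t) & 1
    return (state >> 1) | (fb << (NBITS - 1)), out


def to_slices(states: List[int]) -> List[int]:
    """Transpose row-major states into column-major slices.

    Bit `lane` of slices[i] holds bit i of states[lane].
    """
    slices = [0] * NBITS
    for lane, s in enumerate(states):
        for i in range(NBITS):
            slices[i] |= ((s >> i) & 1) << lane
    return slices


def bitsliced_step(slices: List[int]) -> Tuple[List[int], int]:
    """Clock all lanes at once; return (new_slices, word of output bits)."""
    out = slices[0]          # one output bit per lane
    fb = 0
    for t in TAPS:
        fb ^= slices[t]      # one XOR computes the feedback for every lane
    # The shift becomes a renaming of slices, with no per-bit shifting.
    return slices[1:] + [fb], out


if __name__ == "__main__":
    seeds = [((i * 40503) + 1) & 0xFFFF for i in range(WIDTH)]
    slices = to_slices(seeds)
    for _ in range(4):
        slices, word = bitsliced_step(slices)
        print(f"{word:08x}")
```

In this layout the register shift costs nothing beyond relabeling slices, which is one reason LFSR-based generators map well onto register-oriented hardware.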