ABSTRACT
In this paper, we compare the radiation response of GPUs executing matrix multiplication and FFT algorithms. The experimental results show that, for both algorithms, the output is affected by multiple errors in the majority of cases. Architectural and code analyses indicate that these multiple errors are caused by the corruption of shared resources or by dependencies among threads. The experimental data and analytical studies can be used to estimate the expected error rate of GPUs in realistic applications and to design specific, optimized software-based hardening procedures.
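To make the idea of a software-based hardening procedure for matrix multiplication concrete, the following is a minimal sketch of a checksum-based consistency check, in the spirit of algorithm-based fault tolerance. It is an illustration under stated assumptions, not necessarily the procedure evaluated in the paper: the function names `matmul` and `check_product`, the tolerance `tol`, and the CPU-side Python setting are all hypothetical. The check compares the row and column sums of a computed product against sums derived directly from the inputs, so a corrupted output with several wrong elements shows up as a set of mismatching rows and columns.

```python
def matmul(a, b):
    """Plain product of two matrices given as lists of lists."""
    k, m = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(k)) for j in range(m)]
            for row in a]


def check_product(a, b, c, tol=1e-9):
    """Return (bad_rows, bad_cols): indices where checksums of c disagree with a*b.

    Hypothetical helper for illustration; tol absorbs floating-point rounding.
    """
    k = len(b)
    # Row check: the sum of row i of c must equal a[i] dotted with the row sums of b.
    b_row_sums = [sum(row) for row in b]
    bad_rows = [
        i for i, row in enumerate(c)
        if abs(sum(row) - sum(a[i][t] * b_row_sums[t] for t in range(k))) > tol
    ]
    # Column check: the sum of column j of c must equal the column sums of a
    # dotted with column j of b.
    a_col_sums = [sum(row[t] for row in a) for t in range(k)]
    bad_cols = [
        j for j in range(len(b[0]))
        if abs(sum(row[j] for row in c)
               - sum(a_col_sums[t] * b[t][j] for t in range(k))) > tol
    ]
    return bad_rows, bad_cols


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    c = matmul(a, b)
    print(check_product(a, b, c))   # no mismatches: ([], [])

    # Emulate a multiple-error event that corrupts a whole output row.
    c[0][0] += 1
    c[0][1] += 1
    print(check_product(a, b, c))   # row 0 and both columns flagged: ([0], [0, 1])
```

The pattern of flagged indices hints at the error geometry: one row with several columns, as above, suggests a corrupted output row, whereas a single row and a single column suggest an isolated element. Errors that cancel within a row or column can escape one of the two checks, so this sketch detects errors without guaranteeing that it locates them.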
Index Terms
- Neutron sensitivity and software hardening strategies for matrix multiplication and FFT on graphics processing units