ABSTRACT
GPUs have been employed in embedded systems to handle increased amounts of computation and to satisfy timing requirements. Therefore, the lifetime of embedded GPUs is considered one of the most important aspects of ensuring functional correctness over a long period of time. However, existing state-of-the-art compiler-based GPU aging management techniques suffer from considerable performance overhead. In this paper, we propose a low-overhead aging-aware resource management technique. The proposed technique extends the behavior of the existing warp scheduler and instruction dispatcher to cluster the computational cores and distribute instructions based on aging information. Compared to the original applications, our technique improves the aging of the embedded GPU by 30% on average. In addition, compared to the state-of-the-art GPU aging management technique, our technique reduces the performance overhead by 16.4% on average while improving the aging by 3% on average.
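As a rough illustration of the idea of clustering cores and distributing instructions by aging information, the following minimal sketch groups cores with similar wear and always issues work to the least-aged cluster. It is not the paper's algorithm: the `Core` type, the sort-and-slice clustering, the mean-aging selection rule, and the linear `stress_per_instruction` wear model are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    aging: float  # hypothetical normalized wear estimate; higher means more aged


def cluster_by_aging(cores, cluster_size):
    """Group cores of similar wear: sort by aging, then cut into fixed-size clusters."""
    ordered = sorted(cores, key=lambda c: c.aging)
    return [ordered[i:i + cluster_size] for i in range(0, len(ordered), cluster_size)]


def dispatch(instruction_count, clusters, stress_per_instruction=0.01):
    """Issue each instruction to the cluster with the lowest mean aging.

    Returns how many instructions each cluster received.
    """
    issued = [0] * len(clusters)
    for _ in range(instruction_count):
        # Pick the cluster whose cores are, on average, the least worn.
        target = min(
            range(len(clusters)),
            key=lambda i: sum(c.aging for c in clusters[i]) / len(clusters[i]),
        )
        for core in clusters[target]:
            core.aging += stress_per_instruction  # assumed linear wear model
        issued[target] += 1
    return issued


cores = [Core(0, 0.30), Core(1, 0.10), Core(2, 0.32), Core(3, 0.12)]
clusters = cluster_by_aging(cores, cluster_size=2)
issued = dispatch(10, clusters)
```

In this toy setup the two least-aged cores form the first cluster and absorb the work until their wear catches up with the second cluster, which is the balancing effect an aging-aware dispatcher aims for. A real design would take its aging estimates from hardware sensors or a wear model rather than the fixed increment assumed here.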