DOI: 10.1145/3061639.3062277
research-article

Low-overhead Aging-aware Resource Management on Embedded GPUs

Published: 18 June 2017

ABSTRACT

GPUs are increasingly employed in embedded systems to handle growing computational loads while meeting timing requirements. As a result, the lifetime of an embedded GPU is one of the most important factors in ensuring functional correctness over a long period of time. However, existing state-of-the-art compiler-based GPU aging management techniques incur considerable performance overhead. In this paper, we propose a low-overhead aging-aware resource management technique. The proposed technique extends the existing warp scheduler and instruction dispatcher to cluster the computational cores and distribute instructions based on aging information. Compared with running the original applications, our technique improves the aging of the embedded GPU by 30% on average. In addition, compared with the state-of-the-art GPU aging management technique, it reduces the performance overhead by 16.4% on average while improving aging by 3% on average.
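
The abstract does not spell out the clustering or dispatch policy, so the following is only an illustrative sketch of the general idea of grouping cores by aging and steering work toward the least-aged group. Everything in it is an assumption made for illustration: the `Core` type, the scalar `aging` metric, the fixed `cluster_size`, the uniform `stress_per_instruction`, and the greedy "least-aged cluster first" rule are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    aging: float  # hypothetical accumulated-stress metric; higher means more worn


def cluster_by_aging(cores, cluster_size):
    """Sort cores from least to most aged and cut the list into fixed-size clusters."""
    ordered = sorted(cores, key=lambda c: c.aging)
    return [ordered[i:i + cluster_size] for i in range(0, len(ordered), cluster_size)]


def dispatch(cores, num_instructions, cluster_size, stress_per_instruction=1.0):
    """Send each instruction to the currently least-aged cluster.

    Returns, per instruction, the sorted ids of the cores that executed it.
    The uniform stress increment is an assumed, simplified aging model.
    """
    schedule = []
    for _ in range(num_instructions):
        # Re-cluster every time so the choice reflects the latest aging values.
        target = cluster_by_aging(cores, cluster_size)[0]
        for core in target:
            core.aging += stress_per_instruction
        schedule.append(sorted(c.core_id for c in target))
    return schedule


# Four cores with uneven initial wear (made-up numbers).
cores = [Core(0, 5.0), Core(1, 1.0), Core(2, 3.0), Core(3, 0.0)]
schedule = dispatch(cores, num_instructions=4, cluster_size=2)
spread = max(c.aging for c in cores) - min(c.aging for c in cores)
print(schedule, spread)
```

In this toy run the most worn core is never selected, and the gap between the most and least aged cores shrinks from 5.0 to 1.0, which is the kind of wear balancing an aging-aware dispatcher aims for. A real warp scheduler would also have to respect data dependencies and timing constraints, which this sketch ignores.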


  • Published in

    DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
    June 2017
    533 pages
    ISBN: 9781450349277
    DOI: 10.1145/3061639

    Copyright © 2017 ACM


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 1,770 of 5,499 submissions, 32%
