Abstract
The growth of computing power has driven ever-larger artificial intelligence (AI) models. From the 341K multiply-accumulate operations (MACs) of LeNet-5 to the 4.11G MACs of ResNet-50, the computational cost of image classification has increased 10,000-fold over two decades. This growth has inevitably raised energy consumption, so benchmarking the energy efficiency of modern AI workloads has become essential. Existing benchmarks such as MLPerf and AIBench focus on the performance of AI computing, with time-to-accuracy (TTA) as the primary metric. The straightforward energy counterpart of TTA is to measure the energy consumed until the workload reaches a target accuracy. However, this is too time-consuming and power-hungry to be acceptable for energy efficiency benchmarking. This work introduces a new metric, the Energy-Delay Product of one Epoch (EEDP), to benchmark the energy efficiency of AI training workloads quickly and accurately. EEDP is the product of the energy and time consumed within one training epoch, where one epoch is one training pass over the entire training dataset. It therefore reflects both energy consumption and time efficiency, making it well suited to characterizing the energy efficiency of AI training workloads. We then introduce EAIBench, an energy efficiency benchmark for AI training whose workloads cover different energy efficiency dimensions, including dominant layers, computation intensities, and memory access patterns. Our evaluation shows that EAIBench delivers reproducible and meaningful results in only dozens of minutes, hundreds of times faster than existing AI training benchmarking methods.
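The abstract does not include the authors' measurement tooling, but the EEDP metric is easy to operationalize. Below is a minimal sketch, assuming a single NVIDIA GPU: power is sampled via `nvidia-smi` while one epoch runs, integrated into joules, and multiplied by the epoch time. The helper names (`sample_gpu_power`, `eedp_of_one_epoch`) and the `train_one_epoch` callable are illustrative assumptions, not EAIBench's actual implementation.

```python
import subprocess
import threading
import time

def sample_gpu_power(samples, stop_event, interval_s=0.1):
    """Append (timestamp, watts) pairs by polling nvidia-smi until stopped."""
    while not stop_event.is_set():
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=power.draw",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        samples.append((time.time(), float(out.strip().splitlines()[0])))
        time.sleep(interval_s)

def eedp_of_one_epoch(train_one_epoch):
    """Run one epoch and return (energy in joules, time in seconds, EEDP)."""
    samples, stop = [], threading.Event()
    sampler = threading.Thread(target=sample_gpu_power, args=(samples, stop))
    start = time.time()
    sampler.start()
    train_one_epoch()  # hypothetical callable: one pass over the training set
    stop.set()
    sampler.join()
    elapsed = time.time() - start
    # Trapezoidal integration of the power trace gives energy in joules.
    energy = sum(
        0.5 * (p0 + p1) * (t1 - t0)
        for (t0, p0), (t1, p1) in zip(samples, samples[1:])
    )
    return energy, elapsed, energy * elapsed  # EEDP = E_epoch * T_epoch
```

Multiplying energy by delay, in the spirit of the classic energy-delay product of Gonzalez and Horowitz, penalizes configurations that save energy only by running slower, which is why EEDP captures time efficiency as well as raw consumption.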
References
Adolf, R., Rama, S., Reagen, B., Wei, G.Y., Brooks, D.: Fathom: reference workloads for modern deep learning methods. In: 2016 IEEE International Symposium on Workload Characterization (IISWC), pp. 1–10. IEEE (2016)
Akiba, T., Suzuki, S., Fukuda, K.: Extremely large minibatch SGD: training ResNet-50 on ImageNet in 15 minutes. arXiv preprint arXiv:1711.04325 (2017)
Amodei, D., et al.: Deep speech 2: end-to-end speech recognition in English and Mandarin. In: International Conference on Machine Learning, pp. 173–182. PMLR (2016)
Baidu: DeepBench: benchmarking deep learning operations on different hardware (2017). https://github.com/baidu-research/DeepBench
Cho, E., Myers, S.A., Leskovec, J.: Friendship and mobility: user movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1082–1090 (2011)
Coleman, C., et al.: Analysis of DAWNBench, a time-to-accuracy machine learning performance benchmark. ACM SIGOPS Oper. Syst. Rev. 53(1), 14–25 (2019)
Coleman, C., et al.: DAWNBench: an end-to-end deep learning benchmark and competition. In: NIPS ML Systems Workshop (2017)
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Gonzalez, R., Horowitz, M.: Energy dissipation in general purpose microprocessors. IEEE J. Solid-State Circuits 31(9), 1277–1284 (1996)
Goyal, P., et al.: Accurate, large minibatch SGD: training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677 (2017)
Hajiamini, S., Shirazi, B.A.: A study of DVFS methodologies for multicore systems with islanding feature. In: Advances in Computers, vol. 119, pp. 35–71. Elsevier (2020)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
He, X., Liao, L., Zhang, H., Nie, L., Hu, X., Chua, T.S.: Neural collaborative filtering. In: Proceedings of the 26th International Conference on World Wide Web, pp. 173–182 (2017)
He, X., et al.: Practical lessons from predicting clicks on ads at Facebook. In: Proceedings of the Eighth International Workshop on Data Mining for Online Advertising, pp. 1–9 (2014)
Henning, J.L.: SPEC CPU2006 benchmark descriptions. ACM SIGARCH Comput. Archit. News 34(4), 1–17 (2006)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, Y., Wei, X., Xiao, J., Liu, Z., Xu, Y., Tian, Y.: Energy consumption and emission mitigation prediction based on data center traffic and PUE for global data centers. Global Energy Interconnection 3(3), 272–282 (2020)
Mattson, P., et al.: MLPerf training benchmark. Proc. Mach. Learn. Syst. 2, 336–349 (2020)
Miller, R.: The sustainability imperative: green data centers and our cloudy future. Tech. Rep., Data Center Frontier (2020)
Naumov, M., et al.: Deep learning recommendation model for personalization and recommendation systems. arXiv preprint arXiv:1906.00091 (2019)
NVIDIA: CUDA Profiler User's Guide. https://docs.nvidia.com/cuda/profiler-users-guide/index.html
NVIDIA: Deep Learning Examples (2019). https://github.com/NVIDIA/DeepLearningExamples
Panayotov, V., Chen, G., Povey, D., Khudanpur, S.: LibriSpeech: an ASR corpus based on public domain audio books. In: 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5206–5210. IEEE (2015)
Shi, S., Wang, Q., Xu, P., Chu, X.: Benchmarking state-of-the-art deep learning software tools. In: 2016 7th International Conference on Cloud Computing and Big Data (CCBD), pp. 99–104. IEEE (2016)
Tang, F., et al.: AIBench training: balanced industry-standard AI training benchmarking. In: 2021 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 24–35. IEEE (2021)
Tang, J., Wang, K.: Ranking distillation: learning compact ranking models with high performance for recommender system. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2289–2298 (2018)
Vaswani, A., et al.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017)
Wang, Y., et al.: Benchmarking the performance and energy efficiency of AI accelerators for AI training. In: 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pp. 744–751. IEEE (2020)
Yao, C., et al.: Evaluating and analyzing the energy efficiency of CNN inference on high-performance GPU. Concurr. Comput. Pract. Exp. 33(6), e6064 (2021)
Zhu, H., et al.: TBD: benchmarking and analyzing deep neural network training. arXiv preprint arXiv:1803.06905 (2018)