Abstract
AI continues to emerge as an important workload across enterprise and academia. Benchmarking is an essential tool for understanding its computational requirements and for evaluating the performance of the different types of accelerators available for AI. However, benchmarking AI inference is complicated because one must balance throughput, latency, and efficiency. Here we survey the current state of the field and analyze MLPerf Inference results, which represent the most comprehensive inference performance data available. Additionally, we present our own experience in AI inference benchmarking, along with lessons learned in the process. Finally, we offer suggestions for the future we would like to see in AI benchmarking, from the point of view of a datacenter server vendor.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Hodak, M., Ellison, D., Dholakia, A. (2021). Benchmarking AI Inference: Where we are in 2020. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking. TPCTC 2020. Lecture Notes in Computer Science, vol 12752. Springer, Cham. https://doi.org/10.1007/978-3-030-84924-5_7
DOI: https://doi.org/10.1007/978-3-030-84924-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-84923-8
Online ISBN: 978-3-030-84924-5
eBook Packages: Computer Science (R0)