DOI:10.1145/3569951.3603632
short-paper
Open access

Porting AI/ML Models to Intelligence Processing Units (IPUs)

Published: 10 September 2023

Abstract

Intelligence processing units (IPUs) are accelerators designed specifically for artificial intelligence (AI) and machine learning (ML) workflows. Here, we report on the performance characteristics and code-porting experiences on Graphcore IPUs offered on the new National Science Foundation (NSF)-funded Accelerating Computing for Emerging Sciences (ACES) testbed. Our benchmarks compared the performance of AI/ML frameworks on ACES IPUs to similar runs on Graphcloud, a commercial IPU cloud service offered by Graphcore. We also ported two PyTorch neural network models from graphics processing units (GPUs) to IPUs to verify the efficacy of the software environment. The ported models are TransCycleGAN, which reconstructs high-resolution images from low-resolution images, and the Hierarchical Autoencoder, which compresses large-scale, high-resolution scientific data in climate models. Both models were successfully ported to multiple IPUs using utilities in the Graphcore Poplar software development kit. Increasing the number of IPUs considerably increased the models' throughput.
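
The abstract's closing claim about throughput on multiple IPUs is usually quantified as speedup and scaling efficiency relative to the smallest device count. The following is a minimal sketch of that arithmetic in plain Python, not the authors' benchmarking code; the function name `scaling_report` and the throughput numbers are hypothetical placeholders and are not results from the paper.

```python
from typing import Dict


def scaling_report(throughput: Dict[int, float]) -> Dict[int, Dict[str, float]]:
    """Return speedup and efficiency relative to the smallest device count.

    `throughput` maps a device count (e.g. number of IPUs) to a measured
    throughput in any consistent unit (e.g. samples per second).
    """
    if not throughput:
        raise ValueError("throughput must contain at least one measurement")
    base_n = min(throughput)
    base_tp = throughput[base_n]
    if base_tp <= 0:
        raise ValueError("baseline throughput must be positive")

    report: Dict[int, Dict[str, float]] = {}
    for n in sorted(throughput):
        speedup = throughput[n] / base_tp
        # Ideal (linear) scaling would give speedup == n / base_n.
        efficiency = speedup / (n / base_n)
        report[n] = {"speedup": speedup, "efficiency": efficiency}
    return report


# Hypothetical placeholder throughputs (samples/s); NOT results from the paper.
example = {1: 100.0, 2: 190.0, 4: 360.0}
for n, row in scaling_report(example).items():
    print(f"{n} IPU(s): speedup {row['speedup']:.2f}x, "
          f"efficiency {row['efficiency']:.0%}")
```

An efficiency close to 1.0 indicates near-linear scaling; values well below 1.0 suggest that communication or host-side data loading is limiting the benefit of adding devices.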

Published In

PEARC '23: Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good
July 2023
519 pages
ISBN:9781450399852
DOI:10.1145/3569951

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Accelerating Computing for Emerging Sciences (ACES)
  2. Data Compression
  3. Image Super-Resolution
  4. Intelligence Processing Unit (IPU)
  5. ResNet50

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

PEARC '23

Acceptance Rates

Overall Acceptance Rate 133 of 202 submissions, 66%

Cited By

  • (2024) Performance of Molecular Dynamics Acceleration Strategies on Composable Cyberinfrastructure. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-5. https://doi.org/10.1145/3626203.3670631. Online publication date: 17-Jul-2024
  • (2024) Exploring the Viability of Composable Architectures to Overcome Memory Limitations in High Performance Computing Workflows. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-4. https://doi.org/10.1145/3626203.3670620. Online publication date: 17-Jul-2024
  • (2024) Container Adoption in Campus High Performance Computing at Texas A&M University. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-7. https://doi.org/10.1145/3626203.3670550. Online publication date: 17-Jul-2024
  • (2024) Cultivating Cyberinfrastructure Careers through Student Engagement at Texas A&M University High Performance Research Computing. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-6. https://doi.org/10.1145/3626203.3670544. Online publication date: 17-Jul-2024
  • (2024) Impact of Memory Bandwidth on the Performance of Accelerators. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-9. https://doi.org/10.1145/3626203.3670540. Online publication date: 17-Jul-2024
  • (2024) Insight Gained from Migrating a Machine Learning Model to Intelligence Processing Units. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-9. https://doi.org/10.1145/3626203.3670527. Online publication date: 17-Jul-2024
  • (2023) Developing Synthetic Applications Benchmarks on Composable Cyberinfrastructure: A Study of Scaling Molecular Dynamics Applications on GPUs. Practice and Experience in Advanced Research Computing 2023: Computing for the Common Good, 216-220. https://doi.org/10.1145/3569951.3597556. Online publication date: 23-Jul-2023
