DOI: 10.1145/3489517.3530472
research-article

Hierarchical memory-constrained operator scheduling of neural architecture search networks

Published: 23 August 2022

Abstract

Neural Architecture Search (NAS) is widely used in industry to search for neural networks that meet task requirements. However, scheduling the resulting networks under memory constraints remains a challenge. This paper proposes HMCOS, which performs hierarchical memory-constrained operator scheduling of NAS networks: given a network, HMCOS constructs a hierarchical computation graph and employs an iterative scheduling algorithm to progressively reduce the peak memory footprint. We evaluate HMCOS against RPO and Serenity (two popular scheduling techniques). The results show that HMCOS outperforms existing techniques by supporting more NAS networks, reducing peak memory footprints by 8.7--42.4%, and achieving speedups of 137--283x in scheduling.
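
To make the scheduling problem concrete, the sketch below shows how the execution order of operators can change the peak memory footprint of a small branching graph. It is not the HMCOS algorithm. It assumes a simplified memory model in which every operator produces one tensor, and that tensor is freed as soon as its last consumer has run. The graph, the tensor sizes, and the names `GRAPH`, `peak_memory`, `is_topological`, and `best_peak` are all hypothetical.

```python
from itertools import permutations

# Hypothetical toy graph: op name -> (list of input ops, output tensor size in KB).
GRAPH = {
    "in":  ([], 4),
    "a1":  (["in"], 8),
    "a2":  (["a1"], 2),
    "b1":  (["in"], 8),
    "b2":  (["b1"], 2),
    "out": (["a2", "b2"], 2),
}


def is_topological(schedule, graph):
    """Return True if every op appears after all of its inputs."""
    seen = set()
    for op in schedule:
        if any(src not in seen for src in graph[op][0]):
            return False
        seen.add(op)
    return len(seen) == len(graph)


def peak_memory(schedule, graph):
    """Peak total size of live tensors when ops run in `schedule` order."""
    # Count the consumers of each op's output that have not run yet.
    remaining = {op: 0 for op in graph}
    for inputs, _ in graph.values():
        for src in inputs:
            remaining[src] += 1

    live = peak = 0
    for op in schedule:
        inputs, size = graph[op]
        live += size                      # output is allocated while inputs are still live
        peak = max(peak, live)
        for src in inputs:                # free inputs whose last consumer just ran
            remaining[src] -= 1
            if remaining[src] == 0:
                live -= graph[src][1]
    return peak


def best_peak(graph):
    """Brute-force minimum peak over all valid orders (only feasible for tiny graphs)."""
    return min(
        peak_memory(order, graph)
        for order in permutations(graph)
        if is_topological(order, graph)
    )


interleaved = ["in", "a1", "b1", "a2", "b2", "out"]   # runs both branches side by side
branch_first = ["in", "a1", "a2", "b1", "b2", "out"]  # finishes one branch before the other

print(peak_memory(interleaved, GRAPH))   # 20
print(peak_memory(branch_first, GRAPH))  # 14
print(best_peak(GRAPH))                  # 14
```

Under these assumptions, finishing one branch before starting the other keeps only one large intermediate tensor live at a time. The brute-force search enumerates every permutation of the operators, which grows factorially with graph size, so it only illustrates why exhaustive scheduling does not scale to large networks.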

References

[1] Martín Abadi et al. 2016. TensorFlow: A System for Large-Scale Machine Learning. In OSDI. 265--283.
[2] Byung Hoon Ahn et al. 2020. Ordering Chaos: Memory-Aware Scheduling of Irregularly Wired Neural Networks for Edge Devices. In MLSys, Vol. 2. 44--57.
[3] Tianqi Chen et al. 2018. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. In OSDI. 579--594.
[4] Yaoyao Ding et al. 2021. IOS: Inter-Operator Scheduler for CNN Acceleration. In MLSys, Vol. 3. 167--180.
[5] Song Han et al. 2016. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding. In ICLR.
[6] Kaiming He et al. 2016. Deep Residual Learning for Image Recognition. In CVPR. 770--778.
[7] Andrew G. Howard et al. 2017. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861 [cs.CV]
[8] Gao Huang et al. 2017. Densely Connected Convolutional Networks. In CVPR. 2261--2269.
[9] Zhihao Jia et al. 2019. Optimizing DNN Computation with Relaxed Graph Substitutions. In MLSys, Vol. 1. 27--39.
[10] Marisa Kirisame et al. 2021. Dynamic Tensor Rematerialization. In ICLR.
[11] Thomas Lengauer et al. 1979. A Fast Algorithm for Finding Dominators in a Flowgraph. ACM Trans. Program. Lang. Syst. 1, 1 (Jan. 1979), 121--141.
[12] Hanxiao Liu et al. 2019. DARTS: Differentiable Architecture Search. In ICLR.
[13] Lingxiao Ma et al. 2020. Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks. In OSDI. 881--897.
[14] Xuan Peng et al. 2020. Capuchin: Tensor-Based GPU Memory Management for Deep Learning. In ASPLOS. 891--905.
[15] Esteban Real et al. 2019. Regularized Evolution for Image Classifier Architecture Search. In AAAI. 4780--4789.
[16] Pengzhen Ren et al. 2021. A Comprehensive Survey of Neural Architecture Search: Challenges and Solutions. ACM Comput. Surv. 54, 4, Article 76 (2021).
[17] Karen Simonyan et al. 2015. Very Deep Convolutional Networks for Large-Scale Image Recognition. In ICLR.
[18] Saining Xie et al. 2019. Exploring Randomly Wired Neural Networks for Image Recognition. In ICCV. 1284--1293.
[19] Barret Zoph et al. 2018. Learning Transferable Architectures for Scalable Image Recognition. In CVPR. 8697--8710.

Information & Contributors

Information

Published In

DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
July 2022
1462 pages
ISBN:9781450391429
DOI:10.1145/3489517
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

DAC '22: 59th ACM/IEEE Design Automation Conference
July 10 - 14, 2022
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 106
  • Downloads (Last 6 weeks): 7
Reflects downloads up to 28 Feb 2025

Citations

Cited By

  • (2024) MAGIS: Memory Optimization via Coordinated Graph Transformation and Scheduling for DNN. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, 607-621. DOI: 10.1145/3620666.3651330. Online publication date: 27-Apr-2024
  • (2024) Latency-Based Inter-Operator Scheduling for CNN Inference Acceleration on GPU. IEEE Transactions on Services Computing 17:1, 277-290. DOI: 10.1109/TSC.2023.3345952. Online publication date: Jan-2024
  • (2024) CPM: A Cross-layer Power Management Facility to Enable QoS-Aware AIoT Systems. 2024 IEEE/ACM 32nd International Symposium on Quality of Service (IWQoS), 1-10. DOI: 10.1109/IWQoS61813.2024.10682859. Online publication date: 19-Jun-2024
  • (2024) A Tale of Two Domains: Exploring Efficient Architecture Design for Truly Autonomous Things. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), 167-181. DOI: 10.1109/ISCA59077.2024.00022. Online publication date: 29-Jun-2024
  • (2024) OPASS: Orchestrating TVM's Passes for Lowering Memory Footprints of Computation Graphs. 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME), 175-186. DOI: 10.1109/ICSME58944.2024.00026. Online publication date: 6-Oct-2024
  • (2023) MetaE2RL: Toward Meta-Reasoning for Energy-Efficient Multigoal Reinforcement Learning With Squeezed-Edge You Only Look Once. IEEE Micro 43:6, 29-39. DOI: 10.1109/MM.2023.3318200. Online publication date: 25-Sep-2023
  • (2023) Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization. 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 1-9. DOI: 10.1109/ICCAD57390.2023.10323998. Online publication date: 28-Oct-2023
  • (2023) Minimizing Peak Memory Footprint of Inference on IoTs Devices by Efficient Recomputation. Advanced Intelligent Computing Technology and Applications, 15-26. DOI: 10.1007/978-981-99-4761-4_2. Online publication date: 10-Aug-2023
