DOI: 10.1145/3627703.3650067

NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning

Published: 22 April 2024

ABSTRACT

Efficient on-device Convolutional Neural Network (CNN) training in resource-constrained mobile and edge environments is an open challenge. Backpropagation is the standard approach, but it is GPU memory intensive: its strong inter-layer dependencies require the intermediate activations of the entire CNN to be retained in GPU memory. Training within a limited GPU memory budget therefore requires smaller batch sizes, which in turn results in substantially longer, often impractical, training times. We introduce NeuroFlux, a novel CNN training system tailored for memory-constrained scenarios. We develop two novel techniques: first, adaptive auxiliary networks that employ a variable number of filters to reduce GPU memory usage; and second, block-specific adaptive batch sizes, which not only satisfy the GPU memory constraints but also accelerate the training process. NeuroFlux segments a CNN into blocks based on GPU memory usage and attaches an auxiliary network to each layer in these blocks, breaking the typical layer dependencies under a new training paradigm: 'adaptive local learning'. Moreover, NeuroFlux caches intermediate activations, eliminating redundant forward passes over previously trained blocks and further accelerating training. Compared to backpropagation, the results are twofold: on various hardware platforms, NeuroFlux achieves training speed-ups of 2.3× to 6.1× under stringent GPU memory budgets, and it generates streamlined models with 10.9× to 29.4× fewer parameters.
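To make the training paradigm concrete, the PyTorch sketch below illustrates the adaptive-local-learning loop as the abstract describes it: blocks trained in sequence, each with its own auxiliary head and batch size, and activations cached between blocks. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy two-block CNN, the AuxHead design, and the specific filter counts and batch sizes are hypothetical choices, and NeuroFlux's actual policies for segmenting blocks and sizing heads and batches from the GPU memory budget are not reproduced here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 10

# Toy stand-in dataset: 256 random 3x32x32 "images" with random labels.
X = torch.randn(256, 3, 32, 32)
Y = torch.randint(0, num_classes, (256,))

class AuxHead(nn.Module):
    """Auxiliary network attached to a block: a small conv + classifier.
    n_filters varies per block, trading accuracy for GPU memory (the
    'variable number of filters' idea from the abstract)."""
    def __init__(self, in_ch, n_filters, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, n_filters, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(n_filters, n_classes))

    def forward(self, x):
        return self.net(x)

def batches(inputs, labels, bs):
    for i in range(0, len(inputs), bs):
        yield inputs[i:i + bs], labels[i:i + bs]

# Two toy "blocks"; the aux filter counts and per-block batch sizes below
# are hypothetical stand-ins for the values NeuroFlux would derive from
# the GPU memory budget.
blocks = nn.ModuleList([
    nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU()),
])
aux_filters = [16, 32]
batch_sizes = [64, 128]

cached = X  # inputs to the current block; refreshed once per trained block
for block, n_f, bs in zip(blocks, aux_filters, batch_sizes):
    head = AuxHead(block[0].out_channels, n_f, num_classes)
    opt = torch.optim.SGD(
        list(block.parameters()) + list(head.parameters()), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in batches(cached, Y, bs):
        opt.zero_grad()
        loss = loss_fn(head(block(x)), y)  # local loss; gradients never
        loss.backward()                    # flow into earlier blocks
        opt.step()
    # Cache this block's outputs so later blocks never repeat forward
    # passes over already-trained blocks.
    with torch.no_grad():
        cached = torch.cat([block(x) for x, _ in batches(cached, Y, bs)])
```

Note how the local loss confines both gradients and retained activations to a single block at a time, which is what permits larger, block-specific batch sizes within a fixed memory budget.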


Published in
EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
April 2024, 1245 pages
ISBN: 9798400704376
DOI: 10.1145/3627703
Copyright © 2024 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
