CoCoPIE: enabling real-time AI on off-the-shelf mobile devices via compression-compilation co-design

Published: 24 May 2021

Abstract

A new framework allows intelligence on mainstream end devices without special hardware.

Published In

Communications of the ACM  Volume 64, Issue 6
June 2021
106 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/3467845
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Popular
  • Refereed
