Abstract
How can we efficiently compress a convolutional neural network (CNN) using depthwise separable convolution while retaining its accuracy on classification tasks? Depthwise separable convolution, which replaces a standard convolution with a depthwise convolution followed by a pointwise convolution, has been used to build lightweight architectures. However, previous works based on depthwise separable convolution are limited when compressing a trained CNN model because (1) they are mostly heuristic approaches without a precise understanding of their relation to standard convolution, and (2) their accuracy does not match that of standard convolution. In this paper, we propose Falcon, an accurate and lightweight method for compressing CNNs based on depthwise separable convolution. Falcon uses generalized elementwise product (GEP), our proposed mathematical formulation for approximating the standard convolution kernel, to interpret existing convolution methods based on depthwise separable convolution. By exploiting the knowledge of a trained standard model and carefully determining the order of depthwise separable convolution via GEP, Falcon achieves accuracy close to that of the trained standard model. Furthermore, this interpretation leads to a generalized version, rank-k Falcon, which performs k independent Falcon operations and sums up the results. Experiments show that Falcon (1) provides higher accuracy than existing methods based on depthwise separable convolution and tensor decomposition, and (2) reduces the number of parameters and FLOPs of standard convolution by up to a factor of 8 while maintaining similar accuracy. We also demonstrate that rank-k Falcon further improves the accuracy while slightly sacrificing compression and computation reduction rates.
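To make the idea concrete, the following PyTorch sketch (our illustration, not the authors' implementation; the class names DepthwiseSeparableBlock and RankKBlock are hypothetical) replaces a standard convolution with a depthwise convolution followed by a pointwise 1x1 convolution, and builds a rank-k variant by summing k independent branches, as described in the abstract. Details specific to Falcon, such as fitting the factors to a trained kernel via GEP and the chosen ordering of the two convolutions, are omitted.

import torch
import torch.nn as nn

class DepthwiseSeparableBlock(nn.Module):
    """One depthwise filter per input channel, followed by a pointwise (1x1) convolution."""
    def __init__(self, in_channels, out_channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   padding=kernel_size // 2, groups=in_channels, bias=False)
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

class RankKBlock(nn.Module):
    """Sum of k independent depthwise separable branches (illustrating the rank-k idea)."""
    def __init__(self, in_channels, out_channels, kernel_size=3, rank=2):
        super().__init__()
        self.branches = nn.ModuleList(
            DepthwiseSeparableBlock(in_channels, out_channels, kernel_size)
            for _ in range(rank))

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)

x = torch.randn(1, 64, 32, 32)                            # one 64-channel feature map
standard = nn.Conv2d(64, 128, 3, padding=1, bias=False)
compressed = RankKBlock(64, 128, kernel_size=3, rank=2)
print(sum(p.numel() for p in standard.parameters()))      # 73,728 parameters
print(sum(p.numel() for p in compressed.parameters()))    # 17,536 parameters

For this 64-to-128-channel 3x3 configuration, a single branch (rank 1) uses 8,768 parameters, roughly an 8x reduction over the 73,728 parameters of the standard convolution, which matches the order of reduction reported in the abstract; rank 2 trades part of that reduction for a closer approximation.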




Data availability
The code of Falcon is available at https://github.com/snudm-starlab/FALCON2. All relevant data are within the manuscript. The CIFAR10 and CIFAR100 datasets are available from https://www.cs.toronto.edu/~kriz/cifar.html. The ImageNet dataset is available from http://www.image-net.org/. The SVHN dataset is available from http://ufldl.stanford.edu/housenumbers/.
Acknowledgements
This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [No. 2021-0-01343, Artificial Intelligence Graduate School Program (Seoul National University)], [No. 2021-0-02068, Artificial Intelligence Innovation Hub (Artificial Intelligence Institute, Seoul National University)], [No. 2017-0-01772, Development of QA systems for Video Story Understanding to pass the Video Turing Test], and [No. 2020-0-00894, Flexible and Efficient Model Compression Method for Various Applications and Environments]. The Institute of Engineering Research and the ICT at Seoul National University provided research facilities for this work.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The work was done while Hyun Dong Lee was at Seoul National University.
The work was done while Chun Quan was at Seoul National University.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jang, JG., Quan, C., Lee, H.D. et al. Falcon: lightweight and accurate convolution based on depthwise separable convolution. Knowl Inf Syst 65, 2225–2249 (2023). https://doi.org/10.1007/s10115-022-01818-x