PokerNet: Expanding Features Cheaply via Depthwise Convolutions

Tang, Wei; Huang, Yan; Wang, Liang

doi:10.1007/s11633-021-1288-x

Research Article
Published: 24 March 2021

International Journal of Automation and Computing, Volume 18, pages 432–442 (2021)

Abstract

Pointwise convolution is commonly used to expand or squeeze features in modern lightweight deep models. However, it accounts for most of the overall computational cost (usually more than 90%). This paper proposes a novel Poker module that expands features by taking advantage of cheap depthwise convolution. As a result, the Poker module greatly reduces the computational cost while still generating a large number of effective features to maintain performance. The proposed module is standardized and can be employed wherever feature expansion is needed. By varying the stride and the number of channels, different kinds of bottlenecks are designed to plug the proposed Poker module into the network, so a lightweight model can be easily assembled. Experiments conducted on benchmarks demonstrate the effectiveness of the proposed Poker module. PokerNet models reduce the computational cost by 7.1%–15.6% and achieve comparable or even higher recognition accuracy than previous state-of-the-art (SOTA) models on the ImageNet ILSVRC2012 classification dataset. Code is available at https://github.com/diaomin/pokernet.
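
To make the cost claim concrete, the following Python sketch counts multiply-accumulate operations (MACs) for a generic inverted-bottleneck layout (1x1 expansion, 3x3 depthwise, 1x1 projection). It is not the Poker module, whose internals are not described in this abstract. The layer layout, the function names, and the sizes (14x14 feature map, 64 channels, expansion ratio 6) are assumptions chosen only for illustration.

```python
def pointwise_macs(h, w, c_in, c_out):
    """MACs of a 1x1 (pointwise) convolution on an h x w feature map."""
    return h * w * c_in * c_out


def depthwise_macs(h, w, channels, kernel=3, multiplier=1):
    """MACs of a k x k depthwise convolution (stride 1, 'same' padding).

    Each input channel is filtered on its own and produces `multiplier`
    output channels, so no cross-channel products are computed.
    """
    return h * w * channels * multiplier * kernel * kernel


def pointwise_share(h, w, c, t, kernel=3):
    """Fraction of MACs spent in the two 1x1 layers of an assumed
    expand -> depthwise -> project block with expansion ratio t."""
    expand = pointwise_macs(h, w, c, t * c)
    spatial = depthwise_macs(h, w, t * c, kernel)
    project = pointwise_macs(h, w, t * c, c)
    return (expand + project) / (expand + spatial + project)


# Illustrative sizes (assumed, not taken from the paper).
H, W, C, T = 14, 14, 64, 6

share = pointwise_share(H, W, C, T)

# Generic comparison: widen C channels to T*C channels either with a
# pointwise convolution or with a depthwise convolution that uses a
# channel multiplier of T.
pw_expand = pointwise_macs(H, W, C, T * C)
dw_expand = depthwise_macs(H, W, C, kernel=3, multiplier=T)
ratio = pw_expand / dw_expand

print(f"pointwise share of block MACs: {share:.1%}")
print(f"pointwise vs depthwise expansion cost: {ratio:.2f}x")
```

Under these assumed sizes the two pointwise layers account for roughly 93% of the block's MACs, in line with the "more than 90%" figure above, and widening the features with a depthwise convolution costs about 7 times fewer MACs than widening them with a pointwise convolution. The ratio grows with the channel count, since the pointwise cost scales with the product of input and output channels while the depthwise cost scales with the output channels and the kernel area only.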


Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61525306, 61633021, 61721004, 61806194, U1803261 and 61976132), the Major Project for New Generation of AI (No. 2018AAA0100400), the Beijing Nova Program (No. Z201100006820079), the Shandong Provincial Key Research and Development Program (No. 2019JZZY010119), and CAS-AIR.

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Wei Tang, Yan Huang & Liang Wang
Center for Research on Intelligent Perception and Computing (CRIPAC), Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Wei Tang, Yan Huang & Liang Wang

Corresponding author

Correspondence to Liang Wang.

Additional information

Recommended by Associate Editor Bin Luo

Colored figures are available in the online version at https://link.springer.com/journal/11633

Wei Tang received the B. Sc. degree in automation from Harbin Engineering University (HEU), China in 2013. Currently, he is a Ph. D. candidate at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China, under the guidance of Professor Liang Wang. He has published papers at conferences such as the AAAI Conference on Artificial Intelligence (AAAI) and the Chinese Conference on Computer Vision (CCCV). He won the Best Student Paper Award at CCCV 2015 and the Star of Tomorrow Award for his internship at Microsoft Research Asia (MSRA).

His research interests include deep learning and computer vision, model compression and acceleration.

Yan Huang received the B. Sc. degree in information and computing science from University of Electronic Science and Technology of China (UESTC), China in 2012, and the Ph. D. degree in pattern recognition and intelligent systems from University of Chinese Academy of Sciences (UCAS), China in 2017. In July 2017, he joined the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China as an associate researcher. He has published more than 50 papers in international journals and conferences in related fields. Related papers have won the CVPR Workshop Best Paper Award, the International Conference on Pattern Recognition (ICPR) Best Student Paper Award, etc. He was selected for the Beijing Science and Technology Star Program and the Microsoft Star Casting Program. He was the Co-chair of multimodal symposiums at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) and the International Conference on Computer Vision (ICCV). He has won the Special Award of the President of the Chinese Academy of Sciences, the Excellent Doctoral Dissertation Award of the Chinese Society of Artificial Intelligence, the Baidu Scholarship and the NVIDIA Innovation Research Award.

His research interests include computer vision and multimodal data analysis.

Liang Wang received the B. Eng. degree in radio technology and the M. Eng. degree in circuit system from Anhui University, China in 1997 and 2000, respectively, and the Ph. D. degree in pattern recognition and intelligent systems from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2004. From 2004 to 2010, he was a research assistant at Imperial College London, UK and Monash University, Australia, a research fellow at the University of Melbourne, Australia, and a lecturer at the University of Bath, UK. Currently, he is a full professor of the Hundred Talents Program at the National Lab of Pattern Recognition, CASIA. He has published widely in highly ranked international journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence and IEEE Transactions on Image Processing, and at leading international conferences such as CVPR, ICCV, and International Conference on Data Mining (ICDM). He is a Senior Member of the IEEE and an IAPR Fellow.

His research interests include machine learning, pattern recognition, and computer vision.

About this article

Cite this article

Tang, W., Huang, Y. & Wang, L. PokerNet: Expanding Features Cheaply via Depthwise Convolutions. Int. J. Autom. Comput. 18, 432–442 (2021). https://doi.org/10.1007/s11633-021-1288-x

Received: 23 December 2020
Accepted: 01 February 2021
Published: 24 March 2021
Issue Date: June 2021
DOI: https://doi.org/10.1007/s11633-021-1288-x
