PokerNet: Expanding Features Cheaply via Depthwise Convolutions

  • Research Article
International Journal of Automation and Computing

Abstract

Pointwise convolution is commonly used to expand or squeeze features in modern lightweight deep models, yet it accounts for most of the overall computational cost (usually more than 90%). This paper proposes a novel Poker module that expands features using cheap depthwise convolution instead. The Poker module thereby greatly reduces the computational cost while still generating a large number of effective features to maintain performance. The module is standardized and can be employed wherever feature expansion is needed. By varying the stride and the number of channels, we design different kinds of bottlenecks that plug the Poker module into a network, so a lightweight model can be assembled easily. Experiments on benchmarks demonstrate the effectiveness of the Poker module: PokerNet models reduce the computational cost by 7.1%–15.6% while achieving comparable or even higher recognition accuracy than previous state-of-the-art (SOTA) models on the ImageNet ILSVRC2012 classification dataset. Code is available at https://github.com/diaomin/pokernet.
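
To make the cost argument in the abstract concrete, the following is a minimal sketch of the standard multiply-accumulate (MAC) arithmetic for pointwise and depthwise convolutions. It is not the Poker module and is not taken from the authors' code. The function names, the 56×56 feature map, and the 64 → 384 → 64 channel sizes are illustrative assumptions; stride 1, "same" padding, and no bias terms are also assumed.

```python
def pointwise_macs(h, w, c_in, c_out):
    """MACs of a 1x1 convolution: every output channel mixes all input channels."""
    return h * w * c_in * c_out


def depthwise_macs(h, w, c_out, k):
    """MACs of a k x k depthwise convolution: each output channel reads
    exactly one input channel, so the cost does not depend on c_in."""
    return h * w * c_out * k * k


def pointwise_share(h, w, c_in, c_mid, c_out, k=3):
    """Fraction of MACs spent in the two pointwise layers of a hypothetical
    expand (1x1) -> depthwise (k x k) -> squeeze (1x1) bottleneck."""
    pw = pointwise_macs(h, w, c_in, c_mid) + pointwise_macs(h, w, c_mid, c_out)
    dw = depthwise_macs(h, w, c_mid, k)
    return pw / (pw + dw)


if __name__ == "__main__":
    h = w = 56                      # assumed feature-map size
    c_in, c_mid = 64, 384           # assumed channel counts (6x expansion)

    expand_pw = pointwise_macs(h, w, c_in, c_mid)
    expand_dw = depthwise_macs(h, w, c_mid, 3)

    print("pointwise expansion MACs:", expand_pw)
    print("depthwise expansion MACs:", expand_dw)
    print("depthwise / pointwise   :", expand_dw / expand_pw)
    print("pointwise share of block:", pointwise_share(h, w, c_in, c_mid, c_in))
```

Under these assumptions, producing the same number of output channels with a depthwise layer costs k²/c_in of the pointwise cost (9/64 here), which is why replacing a pointwise expansion with depthwise operations can pay off whenever the input has more than k² channels. How the Poker module actually arranges its depthwise convolutions is described in the full paper, not in this sketch.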

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61525306, 61633021, 61721004, 61806194, U1803261 and 61976132), the Major Project for New Generation of AI (No. 2018AAA0100400), the Beijing Nova Program (No. Z201100006820079), the Shandong Provincial Key Research and Development Program (No. 2019JZZY010119), and CAS-AIR.

Author information

Corresponding author

Correspondence to Liang Wang.

Additional information

Recommended by Associate Editor Bin Luo

Colored figures are available in the online version at https://link.springer.com/journal/11633

Wei Tang received the B. Sc. degree in automation from Harbin Engineering University (HEU), China in 2013. Currently, he is a Ph. D. degree candidate at the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China under the guidance of Professor Liang Wang. He has published papers at conferences such as the AAAI Conference on Artificial Intelligence (AAAI) and the Chinese Conference on Computer Vision (CCCV). He won the Best Student Paper Award at CCCV 2015 and the Star of Tomorrow Award during an internship at Microsoft Research Asia (MSRA).

His research interests include deep learning and computer vision, model compression and acceleration.

Yan Huang received the B. Sc. degree in information and computing science from University of Electronic Science and Technology of China (UESTC), China in 2012, and the Ph. D. degree in pattern recognition and intelligent systems from University of Chinese Academy of Sciences (UCAS), China in 2017. In July 2017, he joined the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), China as an associate researcher. He has published more than 50 papers in international journals and conferences in related fields. Related papers have won the CVPR Workshop Best Paper Award, the International Conference on Pattern Recognition (ICPR) Best Student Paper Award, etc. He was selected for the Beijing Science and Technology Star Program and the Microsoft Star Casting Program. He was the Co-chair of multimodal symposiums at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) and the International Conference on Computer Vision (ICCV). He has won the special award of president of Chinese Academy of Sciences, the Excellent Doctoral Dissertation Award of Chinese society of artificial intelligence, the Baidu scholarship and the NVIDIA Innovation Research Award.

His research interests include computer vision and multimodal data analysis.

Liang Wang received the B. Eng. degree in radio technology and the M. Eng. degree in circuit system from Anhui University, China in 1997 and 2000, respectively, and the Ph. D. degree in pattern recognition and intelligent systems from Institute of Automation, Chinese Academy of Sciences (CASIA), China in 2004. From 2004 to 2010, he was a research assistant at Imperial College London, UK and Monash University, Australia, a research fellow at the University of Melbourne, Australia, and a lecturer at the University of Bath, UK. Currently, he is a full professor of the Hundred Talents Program at the National Lab of Pattern Recognition, CASIA. He has published widely in highly ranked international journals such as IEEE Transactions on Pattern Analysis and Machine Intelligence and IEEE Transactions on Image Processing, and at leading international conferences such as CVPR, ICCV, and International Conference on Data Mining (ICDM). He is a Senior Member of the IEEE and an IAPR Fellow.

His research interests include machine learning, pattern recognition, and computer vision.

Cite this article

Tang, W., Huang, Y. & Wang, L. PokerNet: Expanding Features Cheaply via Depthwise Convolutions. Int. J. Autom. Comput. 18, 432–442 (2021). https://doi.org/10.1007/s11633-021-1288-x

  • DOI: https://doi.org/10.1007/s11633-021-1288-x
