Abstract
In this work, we describe a novel and highly efficient convolutional neural network for image recognition, which we term the “Cross Connected Network” (CrossNet). We introduce the Pod structure, in which the feature maps of depthwise convolutions can be reused within the same Pod through cross connections. This design gives CrossNet high performance at a low computational cost, making it especially suitable for mobile devices with very limited computing power. Additionally, we find that depthwise convolutions with large receptive fields offer better accuracy/computation trade-offs and further improve CrossNet's performance. Our experiments on ImageNet classification and MS COCO object detection demonstrate that CrossNet improves on the state-of-the-art performance of lightweight networks such as MobileNets-V1/-V2, ShuffleNet, and CondenseNet. We also measure actual inference time on an ARM-based mobile device, where CrossNet again achieves the best performance. Code and models are publicly available at https://github.com/soeaver/CrossNet-PyTorch.
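The full Pod design is specified in the paper itself; purely as an illustration of the idea sketched in the abstract, a minimal PyTorch Pod-like block might look as follows. The class name `PodSketch`, the number of stages, and the channel bookkeeping are our assumptions for this sketch, not the authors' implementation (see the linked repository for the released code).

```python
import torch
import torch.nn as nn

class PodSketch(nn.Module):
    """Illustrative Pod-like block (hypothetical, not the authors' code).

    Each stage runs a depthwise convolution; its feature maps are cached
    and reused by all later stages via channel concatenation, which is
    the "cross connection" idea described in the abstract.
    """

    def __init__(self, channels: int, num_stages: int = 3, kernel_size: int = 5):
        super().__init__()
        pad = kernel_size // 2  # 'same' padding; a 5x5 kernel widens the receptive field cheaply
        self.depthwise = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size, padding=pad,
                      groups=channels, bias=False)        # one filter per input channel
            for _ in range(num_stages))
        self.pointwise = nn.ModuleList(
            nn.Sequential(                                 # 1x1 conv mixes channels
                nn.Conv2d(channels * (i + 1), channels, 1, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True))
            for i in range(num_stages))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        reused = []  # depthwise feature maps kept for cross connections
        out = x
        for dw, pw in zip(self.depthwise, self.pointwise):
            reused.append(dw(out))               # compute and cache depthwise maps
            out = pw(torch.cat(reused, dim=1))   # reuse every earlier depthwise map
        return out

x = torch.randn(1, 32, 56, 56)
print(PodSketch(32)(x).shape)  # torch.Size([1, 32, 56, 56])
```

Because the depthwise feature maps are concatenated rather than recomputed, later stages see them at the cost of only a 1x1 convolution, which is where the favorable accuracy/computation trade-off would come from under this reading.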
Notes
1. In this paper, MFLOPs denotes the number of multiply-add operations in millions, and MParams denotes the number of parameters in millions.
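As a concrete illustration of this accounting (our own worked example, not taken from the paper), the mult-add count of a stride-1, same-padded depthwise separable convolution can be computed as follows:

```python
def sep_conv_mult_adds(k: int, c_in: int, c_out: int, h: int, w: int) -> int:
    """Multiply-add count of a k x k depthwise convolution followed by a
    1x1 pointwise convolution (stride 1, 'same' padding), using the
    standard accounting for MobileNets-style networks."""
    depthwise = k * k * c_in * h * w   # one k x k filter per input channel
    pointwise = c_in * c_out * h * w   # 1x1 conv mixing channels
    return depthwise + pointwise

# A 3x3 separable conv, 32 -> 64 channels, on a 112x112 feature map:
print(sep_conv_mult_adds(3, 32, 64, 112, 112) / 1e6)  # ~29.3 MFLOPs
```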
References
Chen, L., Papandreou, G., Kokkinos, I., Murphy, K., Yuille, A.: DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv:1606.00915 (2016)
Chen, L., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv:1706.05587 (2017)
Chen, L., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. arXiv:1802.02611 (2018)
Chen, Y., Li, J., Xiao, H., Jin, X., Yan, S., Feng, J.: Dual path networks. In: NIPS (2017)
Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: CVPR (2017)
Dai, J., et al.: Deformable convolutional networks. In: ICCV (2017)
Gatys, L., Ecker, A., Bethge, M.: Image style transfer using convolutional neural networks. In: CVPR (2016)
Girshick, R.: Fast R-CNN. In: ICCV (2015)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: CVPR (2014)
Goyal, P., et al.: Accurate, large minibatch SGD: training ImageNet in 1 hour. arXiv:1706.02677 (2017)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: ICCV (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR (2016)
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9908, pp. 630–645. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46493-0_38
Hong, S., Roh, B., Kim, K., Cheon, Y., Park, M.: PVANet: lightweight deep neural networks for real-time object detection. arXiv:1611.08588 (2016)
Howard, A., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 (2017)
Huang, G., Liu, S., Maaten, L., Weinberger, K.: CondenseNet: an efficient DenseNet using learned group convolutions. arXiv:1711.09224 (2017)
Huang, G., Liu, Z., Weinberger, K.: Densely connected convolutional networks. In: CVPR (2017)
Iandola, F., Han, S., Moskewicz, M., Ashraf, K., Dally, W., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. arXiv:1602.07360 (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.: ImageNet classification with deep convolutional neural networks. In: NIPS (2012)
LeCun, Y., et al.: Backpropagation applied to handwritten zip code recognition. Neural Comput. 1, 541–551 (1989)
Li, S., Xu, X., Nie, L., Chua, T.: Laplacian-steered neural style transfer. In: ACM MM (2017)
Lin, T., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: CVPR (2017)
Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
Liu, S., Huang, D., Wang, Y.: Receptive field block net for accurate and fast object detection. arXiv:1711.07767 (2017)
Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR (2015)
Matan, O., Burges, C., LeCun, Y., Denker, J.: Multi-digit recognition using a space displacement neural network. In: NIPS (1991)
Nair, V., Hinton, G.: Rectified linear units improve restricted Boltzmann machines. In: ICML (2010)
Pham, H., Guan, M., Zoph, B., Le, Q., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv:1802.03268 (2018)
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: CVPR (2016)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS (2015)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV (2015)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.: Inverted residuals and linear bottlenecks: mobile networks for classification, detection and segmentation. arXiv:1801.04381 (2018)
Szegedy, C., et al.: Going deeper with convolutions. In: CVPR (2015)
Wang, M., Liu, B., Foroosh, H.: Design of efficient convolutional layers using single intra-channel convolution, topological subdivisioning and spatial bottleneck structure. arXiv:1608.04337 (2016)
Xie, S., Girshick, R., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: CVPR (2017)
Yu, J., Jiang, Y., Wang, Z., Cao, Z., Huang, T.: UnitBox: an advanced object detection network. In: ACM MM (2016)
Zagoruyko, S., Komodakis, N.: Wide residual networks. In: BMVC (2016)
Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. arXiv:1707.01083 (2017)
Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: CVPR (2017)
Zhu, B., Chen, Y., Wang, J., Liu, S., Zhang, B., Tang, M.: Fast deep matting for portrait animation on mobile phone. In: ACM MM (2017)
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.: Learning transferable architectures for scalable image recognition. arXiv:1707.07012 (2017)
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, L., Song, Q., Li, Z., Wu, Y., Li, X., Hu, M. (2019). Cross Connected Network for Efficient Image Recognition. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. Lecture Notes in Computer Science, vol 11361. Springer, Cham. https://doi.org/10.1007/978-3-030-20887-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20886-8
Online ISBN: 978-3-030-20887-5