Efficient densely connected convolutional neural networks
Introduction
In recent years, deep neural networks (DNNs) have achieved great success in many fields [1], [2], [3], [4]. For instance, H-LSTCM [5] and CCG-LSTM [6] achieved state-of-the-art results for collective activity recognition. HGDNs [7] can customize a suitable scale for each pixel, showing competitive results on the semantic segmentation task. ResNet [8] can surpass human-level performance in image classification tasks. Recurrent neural networks (RNNs) perform outstandingly in modeling sequential data; SC-RNN [9] can simultaneously capture the spatial coherence among joints and the temporal evolution among skeletons on a co-attention feature map, enabling efficient prediction of human motion. However, DNNs also have some problems to be addressed. For example, DNNs are prone to overfitting when sufficient training data is lacking. Qi [10] proposed Lipschitz regularization theory and algorithms for a novel Loss-Sensitive Generative Adversarial Network (LS-GAN), which performed outstandingly on image classification tasks through semi-supervised learning when labeled data is very limited. Generalized deep transfer networks (DTNs) were proposed to mitigate the problem of insufficient training images by bringing in rich labels from the textual domain [11], [12].
In DNNs, parameter redundancy causes overfitting and resource consumption, and the problem is even more serious in convolutional neural networks (CNNs). Initially, to obtain high accuracy, CNNs [13], [14], such as the winners of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [15], became very deep and acquired a great number of channels. VGG-16 [13] has 13 convolutional layers and 3 fully connected layers with 138 million parameters. ResNet-100 [14] and DenseNet-121 [16] have 25 million and 8.1 million parameters, respectively. Xie et al. proposed a filter-in-filter scheme to enhance the expressibility of a filter [17]. A CNN that promotes competition among multiple-size filters was proposed to improve accuracy [18]. These CNNs with high accuracy and abundant parameters are designed for servers with abundant computational resources, and consequently cannot perform real-time inference on low-compute devices. Wu et al. tried to find a good compromise between depth and width and proposed a group of relatively shallow CNNs. Deep CNNs have a wide range of applications, and sometimes need to be deployed on low-compute, low-power mobile devices such as self-driving cars, smartphones, and robots. Therefore, CNNs with small model size, few parameters, and low computational cost, yet high accuracy, are in urgent demand.
To reduce the parameters and FLOPs of CNNs, several approaches for designing efficient CNNs have been proposed, e.g., pruning [19], [20], quantization [21], [22], and more efficient network architectures [23], [24]. Compared with VGG [13], ResNet [14] reduced the computational cost by a factor of 5×, and DenseNet [16] by a factor of 10×, while obtaining higher accuracies on ImageNet. Both ResNet and DenseNet use 1 × 1 convolution kernels to reduce parameters and computational cost. Furthermore, ResNet addresses the vanishing gradient problem with shortcut connections, and DenseNet induces feature reuse by directly connecting each layer with all preceding layers. However, in ResNet and DenseNet, the standard 3 × 3 and 1 × 1 convolutions still carry a great deal of parameters and computational cost. Subsequently, the more efficient MobileNet [25] reduced the computational cost by approximately 25× with competitive accuracies by using efficient depthwise separable convolutions.
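To see why depthwise separable convolution is so much cheaper, the parameter counts of a standard 3 × 3 convolution and its depthwise separable counterpart can be compared directly. The following is a minimal sketch in plain Python; the channel sizes are illustrative, not taken from MobileNet:

```python
def standard_conv_params(c_in, c_out, k=3):
    # Standard convolution: every output channel filters all input channels.
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k=3):
    # Depthwise: one k x k filter per input channel,
    # then a 1 x 1 pointwise convolution to mix channels.
    return k * k * c_in + c_in * c_out

c_in, c_out = 128, 128
std = standard_conv_params(c_in, c_out)        # 147456
dsc = depthwise_separable_params(c_in, c_out)  # 1152 + 16384 = 17536
print(std, dsc, round(std / dsc, 1))           # roughly 8.4x fewer parameters
```

With larger channel counts the ratio approaches k², which is consistent with the order-of-magnitude savings reported for depthwise separable designs.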
In this paper, to design more efficient CNNs, we propose two novel and efficient architectures, DenseDsc and Dense2Net, which are stacks of dense blocks. In DenseDsc, each DenseDsc block within a dense block consists of two convolutional layers: the first contains two parallel efficient depthwise convolution layers, and the second is a 1 × 1 group convolution layer that fuses information efficiently. In Dense2Net, each Dense2Net block within a dense block consists of a 3 × 3 group convolution layer for extracting features and a 1 × 1 convolution for fusing information and reducing channels. It is well known that both depthwise and group convolutions are more efficient than standard convolution. The proposed DenseDsc and Dense2Net induce feature reuse through dense connectivity while keeping computation lightweight through efficient convolution methods. We also define several hyperparameters for DenseDsc and Dense2Net that make it easy to adjust the model size, so that both networks can be used in different cases.
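The two ingredients the blocks rely on, group convolution for cheap feature extraction and channel shuffle for cross-group information exchange, can be sketched in plain Python. The shapes below are illustrative assumptions, not the authors' exact configuration:

```python
def group_conv_params(c_in, c_out, k, groups):
    # Each group convolves c_in/groups channels to c_out/groups channels,
    # so the parameter count drops by a factor of `groups`.
    assert c_in % groups == 0 and c_out % groups == 0
    return k * k * (c_in // groups) * (c_out // groups) * groups

def channel_shuffle(channels, groups):
    # Reorder channel indices so the next group convolution sees channels
    # from every previous group (the operation popularized by ShuffleNet):
    # reshape to (groups, n/groups), transpose, flatten.
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]

print(group_conv_params(64, 64, 3, 1))     # 36864 (equivalent to standard conv)
print(group_conv_params(64, 64, 3, 4))     # 9216: 4x fewer parameters
print(channel_shuffle(list(range(8)), 2))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

Without the shuffle, information would stay trapped inside each group across stacked group convolutions; the interleaved ordering shown in the last line is what restores cross-channel mixing.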
We evaluate DenseDsc and Dense2Net on two highly competitive benchmark datasets, CIFAR and ImageNet. On CIFAR, the experimental results show that Dense2Net and DenseDsc achieve state-of-the-art results on CIFAR-10 and CIFAR-100, respectively. Compared with efficient CNNs with fewer than 1.0 M parameters on CIFAR, the accuracies of the two proposed CNNs are very competitive. On ImageNet, Dense2Net achieves state-of-the-art results among manually designed CNNs with fewer than 10 M parameters.
Our main contributions are summarized as follows.
- We propose a novel building block called the DenseDsc block, which is very efficient thanks to depthwise separable convolution.
- We introduce an efficient and novel densely connected CNN called DenseDsc, built by stacking dense blocks constructed from DenseDsc blocks, which achieves state-of-the-art results on CIFAR-100 among CNNs with fewer than 0.5 M parameters.
- We propose an efficient and novel Dense2Net block using channel shuffle and group convolution operations, which improves parameter efficiency and information transfer.
- We present an efficient densely connected CNN, Dense2Net, which achieves state-of-the-art results on CIFAR-10 among CNNs with fewer than 0.5 M parameters. The accuracy of Dense2Net on ImageNet is higher than that of all manually designed CNNs with fewer than 10 M parameters.
- We study the effect of the growth rate k on accuracy: as k increases, the accuracies of DenseDsc and Dense2Net improve markedly.
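The role of the growth rate k follows directly from dense connectivity: the l-th layer in a dense block receives the block input concatenated with the k feature maps produced by each earlier layer. A small sketch of this channel bookkeeping, with illustrative values rather than the paper's configurations:

```python
def dense_block_input_channels(c0, k, num_layers):
    # In a dense block, layer l sees the block input (c0 channels) plus
    # the k new channels emitted by each of the l earlier layers.
    return [c0 + l * k for l in range(num_layers)]

# e.g. a block with 16 input channels, growth rate k = 12, 6 layers:
print(dense_block_input_channels(16, 12, 6))
# [16, 28, 40, 52, 64, 76]; the block outputs 16 + 6*12 = 88 channels
```

This linear growth is why a larger k raises accuracy at the price of wider layers, and why efficient convolutions inside each block matter: every increase in k widens the input of all subsequent layers in the block.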
The remainder of this paper is organized as follows. Section 2 presents related work. The details of Dense2Net and DenseDsc are introduced in Section 3. The experimental results are discussed in Section 4, and the conclusion is given in Section 5.
Section snippets
Related work and background
In this section, a brief introduction to improving convolution efficiency is presented and DenseNet [16] is reviewed in detail. Besides these approaches, there are other routes to efficient CNN architectures, such as knowledge distillation [26] and neural architecture search [27], which are not explored in this paper.
Proposed method
In this section, the two proposed efficient architectures, DenseDsc and Dense2Net, are introduced in detail, and their parameter efficiency is analyzed. The trend in FLOPs is similar to that of the parameters.
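The observation that FLOPs track the parameter count can be seen from the standard cost model for convolution: multiply-accumulate operations equal the parameter count times the output spatial size, so any grouping factor reduces both by the same ratio. A plain-Python sketch with illustrative shapes:

```python
def conv_params(c_in, c_out, k, groups=1):
    # Weight count of a (possibly grouped) k x k convolution.
    return k * k * (c_in // groups) * (c_out // groups) * groups

def conv_flops(c_in, c_out, k, h_out, w_out, groups=1):
    # Multiply-accumulates: one pass of the weights per output position.
    return conv_params(c_in, c_out, k, groups) * h_out * w_out

# Grouping reduces parameters and FLOPs by the same factor:
p1, f1 = conv_params(64, 64, 3), conv_flops(64, 64, 3, 32, 32)
p4, f4 = conv_params(64, 64, 3, groups=4), conv_flops(64, 64, 3, 32, 32, groups=4)
print(p1 // p4, f1 // f4)  # 4 4
```

The spatial factor h_out × w_out is the same for both variants, which is why analyzing parameter efficiency alone is a reasonable proxy for the FLOPs trend.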
Experiments and analysis
In this section, we use the CIFAR and ImageNet datasets to evaluate the performance of DenseDsc and Dense2Net, focusing mainly on efficiency and accuracy.
Conclusion
In this paper, we propose two efficient densely connected convolutional neural networks, called DenseDsc and Dense2Net. Both networks take advantage of dense connections to improve feature reuse and use efficient convolutions to improve efficiency. In DenseDsc, efficient depthwise separable convolution is used to improve efficiency. In Dense2Net, we use group convolution to improve parameter efficiency, and the multi-level group convolutions can
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
This research work was partly supported by the National Natural Science Foundation of China (Project No. 61750110529, 61850410535), the Natural Science Foundation of Jiangsu Province (Project No. BK20161147), the Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX18_0058).
References (43)
- et al., Recent advances in convolutional neural networks, Pattern Recognit. (2018)
- et al., Pruning filters for efficient convnets, International Conference on Learning Representations (2017)
- et al., ThiNet: pruning CNN filters for a thinner net, IEEE Trans. Pattern Anal. Mach. Intell. (2019)
- et al., Recent advances in convolutional neural network acceleration, Neurocomputing (2019)
- et al., Efficient convolution neural networks for object tracking using separable convolution and filter pruning, IEEE Access (2019)
- et al., Macro unit-based convolutional neural network for very light-weight deep learning, Image Vis. Comput. (2019)
- et al., Pelee: a real-time object detection system on mobile devices, Advances in Neural Information Processing Systems (2018)
- et al., Social anchor-unit graph regularized tensor completion for large-scale image retagging, IEEE Trans. Pattern Anal. Mach. Intell. (2019)
- et al., Line-CNN: end-to-end traffic line detection with line proposal unit, IEEE Trans. Intell. Transp. Syst. (2019)
- et al., Monocular depth estimation with hierarchical fusion of dilated CNNs and soft-weighted-sum inference, Pattern Recognit. (2018)
- Hierarchical long short-term concurrent memory for human interaction recognition, IEEE Trans. Pattern Anal. Mach. Intell.
- Coherence constrained graph LSTM for group activity recognition, IEEE Trans. Pattern Anal. Mach. Intell.
- Hierarchically gated deep networks for semantic segmentation, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- Identity mappings in deep residual networks, European Conference on Computer Vision
- Loss-sensitive generative adversarial networks on Lipschitz densities, Int. J. Comput. Vis.
- Weakly-shared deep transfer networks for heterogeneous-domain knowledge propagation, Proceedings of the 23rd ACM International Conference on Multimedia
- Generalized deep transfer networks for knowledge propagation in heterogeneous domains, ACM Trans. Multimed. Comput. Commun. Appl. (TOMM)
- Very deep convolutional networks for large-scale image recognition, 3rd International Conference on Learning Representations
- Deep residual learning for image recognition, IEEE Conference on Computer Vision and Pattern Recognition
- ImageNet large scale visual recognition challenge, Int. J. Comput. Vis.
Cited by (73)
- Reliability enhancement of state of health assessment model of lithium-ion battery considering the uncertainty with quantile distribution of deep features, 2024, Reliability Engineering and System Safety
- Unveiling the secrets of online consumer choice: A deep learning algorithmic approach to evaluate and predict purchase decisions through EEG responses, 2024, Information Processing and Management
- A novel cancelable finger vein templates based on LDM and RetinexGan, 2023, Pattern Recognition
- DGFaceNet: Lightweight and efficient face recognition, 2023, Engineering Applications of Artificial Intelligence
- Feature Sampling based on Multilayer Perceptive Neural Network for image quality assessment, 2023, Engineering Applications of Artificial Intelligence
Guoqing Li received the B.S. degree from Qingdao University, Qingdao, China, in 2014, the M.S. degree from South China Normal University, Guangzhou, China, in 2017. He is currently pursuing the Ph.D. degree with the National ASIC Engineering Technology Research Center, School of Electronics Science and Engineering, Southeast University, Nanjing, China. His current research interests include computer vision, convolutional neural network, deep learning hardware accelerator.
Meng Zhang received the B.S. degree in Electrical Engineering from the China University of Mining and Technology, Xuzhou, China, in 1986, and the M.S. and Ph.D. degrees in Bioelectronics Engineering from Southeast University, Nanjing, China, in 1993 and 2014, respectively. He is currently a professor in the National ASIC System Research Center, College of Electronic Science and Engineering, Southeast University, Nanjing, PR China, and a supervisor of Ph.D. students. His research interests include digital signal and image processing, digital communication systems, wireless sensor networks, information security and assurance, cryptography, digital integrated circuit design, and machine learning. He is an author or coauthor of more than 50 refereed journal and international conference papers and a holder of more than 60 patents, including several PCT and US patents.
Jiaojie Li is an M.S. student at the National ASIC Center, School of Microelectronics, Southeast University, China. He received the B.S. degree from Dalian University of Technology, Dalian, China, in 2018. His research interests include digital integrated circuit design and deep learning techniques.
Feng Lv is an M.S. student at the National ASIC Center, School of Microelectronics, Southeast University, China. He received the B.S. degree from Southeast University, Nanjing, China, in 2018. His research interests include computer vision and pattern recognition.
Guodong Tong received the B.S. and M.S. degrees from the College of Physical Science and Technology, Yangzhou University, Yangzhou, China, in 2013 and 2018, respectively. He worked for Huawei and China Mobile for one year. He is currently pursuing the Ph.D. degree with the School of Electronic Science and Engineering, National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China. His research interests include circuit timing analysis, AI, and graph convolutional networks (GCNs). He is a student member of the IEEE.