Deep neural networks with Elastic Rectified Linear Units for object recognition

Neurocomputing, Volume 275, 31 January 2018, Pages 1132-1139

https://doi.org/10.1016/j.neucom.2017.09.056

Abstract

Rectified Linear Unit (ReLU) is crucial to the recent success of deep neural networks (DNNs). In this paper, we propose a novel Elastic Rectified Linear Unit (EReLU) that focuses on processing the positive part of the input. Unlike previous variants of ReLU, which typically adopt linear or piecewise linear functions to represent the positive part, EReLU is characterized by the fact that each positive value scales within a moderate range, like a spring, during the training stage. At test time, EReLU becomes the standard ReLU. EReLU improves model fitting with no extra parameters and little risk of overfitting. Furthermore, we propose the Elastic Parametric Rectified Linear Unit (EPReLU) by taking advantage of EReLU and the parametric ReLU (PReLU). EPReLU is able to further improve the performance of networks. In addition, we present a new training strategy for training DNNs with EPReLU. Experiments on four benchmarks, CIFAR10, CIFAR100, SVHN and ImageNet 2012, demonstrate the effectiveness of both EReLU and EPReLU.

Introduction

Deep neural networks (DNNs) [1], [5], [7], [11], [32], [33], [34], [36], [38], [42], [43] have brought a tremendous rise in performance to a variety of computer vision tasks, including image classification [12], [18], [27], [29], [39], object detection [4], [6], [9], [17], [28], [30], [47], [49] and image retrieval [44], [48]. This success is mainly attributed to advances in three aspects: more powerful network structures [24], better training strategies, and effective regularization techniques against overfitting. Firstly, neural networks are becoming more and more powerful owing to increased depth and width, along with sophisticated layer design techniques. Secondly, training strategies such as stochastic gradient descent (SGD) [20], non-saturating nonlinear activation functions and batch normalization (BN) [14] have proven to be effective ways of training deep networks. Lastly, regularization techniques such as Dropout [18] can effectively combat overfitting.

Among these advances, the non-saturating nonlinear activation function (e.g., the Rectified Linear Unit (ReLU) [26]) is an important step toward feasible deep neural networks. Deep neural networks with ReLU train much faster than their equivalents with saturating nonlinearities. Fast learning has a great influence on the performance of large models trained on large datasets. ReLU simply keeps the positive part and prunes the negative part to zero. This nonlinear function is superior to saturating nonlinearities in that the derivative of the positive part is a constant value; therefore, ReLU does not suffer from vanishing gradients. Recently, more efforts have concentrated on the study of non-saturating nonlinear activation functions. These methods fall into two categories. On the one hand, some approaches pay special attention to the negative part: they use fixed, learnable, or randomized coefficients to control the slope of the negative part. On the other hand, other methods adopt more complex piecewise linear functions, formulated with several learnable parameters, to deal with the whole input.
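In its standard form, ReLU and its derivative are

f(x) = max(0, x),    f'(x) = 1 for x > 0 and f'(x) = 0 for x < 0,

so the gradient passing through an active unit is never attenuated, in contrast to saturating functions such as the sigmoid, whose derivative vanishes for inputs of large magnitude.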

In this paper, we propose a novel Elastic Rectified Linear Unit (EReLU) that allows the positive part of the input to fluctuate during the training stage. The motivation is clear: similar samples are likely to generate similar responses at the same place. Therefore, letting the response fluctuate within a moderate range strengthens the robustness of the network. In addition, we propose the Elastic Parametric Rectified Linear Unit (EPReLU) by fusing EReLU and the parametric ReLU (PReLU) [13]. That is, the positive part and the negative part are processed independently using two different activation methods. Further improvement is observed by adopting this compound activation strategy.
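To make the elastic behavior concrete, the following is a minimal NumPy sketch of an EReLU-style forward pass. It is illustrative only: the uniform sampling range [1 - alpha, 1 + alpha], the per-activation sampling granularity, and the hyperparameter alpha are assumptions introduced here, not the paper's exact parameterization.

import numpy as np

def erelu(x, alpha=0.1, training=True, rng=np.random):
    # Keep the positive part; the negative part is pruned to zero as in ReLU.
    pos = np.maximum(x, 0.0)
    if not training:
        # At test time EReLU reduces to the standard ReLU.
        return pos
    # During training, scale each positive response by a random factor
    # drawn around 1, letting it fluctuate within a moderate range
    # (assumed here to be [1 - alpha, 1 + alpha]).
    k = rng.uniform(1.0 - alpha, 1.0 + alpha, size=x.shape)
    return k * pos

With training=False the call returns np.maximum(x, 0.0), matching the statement that EReLU becomes the standard ReLU at test time.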

The contributions and merits of this paper are summarized as follows. (1) We propose a novel Elastic Rectified Linear Unit (EReLU) that mainly deals with the positive part of the input. EReLU makes the positive part fluctuate within a moderate range in each epoch during the training stage, which strengthens the robustness of the network model. (2) We also propose a compound activation strategy called Elastic Parametric Rectified Linear Unit (EPReLU) by fusing EReLU and the parametric ReLU (PReLU). EReLU focuses on the positive part whereas PReLU deals with the negative part. EPReLU brings a further accuracy gain. (3) We present a new training strategy to train deep neural networks with EPReLU. During training, EReLU and PReLU are used alternately instead of being adopted at the same time. This alternate updating method plays an important role in training networks with EPReLU.
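As a sketch of how the two parts could be combined, the class below applies EReLU-style scaling to the positive part and a PReLU slope to the negative part. The elastic range alpha, the initial slope a = 0.25, and the elastic_on flag are illustrative assumptions; the alternate updating schedule of contribution (3) concerns the training loop and is only hinted at here.

import numpy as np

class EPReLUSketch:
    def __init__(self, a=0.25, alpha=0.1, rng=np.random):
        self.a = a          # PReLU slope for the negative part (learned by SGD)
        self.alpha = alpha  # assumed elastic range for the positive part
        self.rng = rng

    def forward(self, x, training=True, elastic_on=True):
        pos = np.maximum(x, 0.0)
        neg = np.minimum(x, 0.0)
        if training and elastic_on:
            # Elastic scaling of the positive part (EReLU behavior).
            pos = pos * self.rng.uniform(1.0 - self.alpha, 1.0 + self.alpha,
                                         size=x.shape)
        # PReLU handling of the negative part; at test time the positive
        # part is left unscaled, so the unit acts as ReLU plus a PReLU slope.
        return pos + self.a * neg

A training loop following the paper's strategy would alternate between phases in which the elastic scaling is active and phases in which the PReLU slope a is updated, rather than applying both mechanisms in every update; the exact schedule is given in the full paper.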

This paper is organized as follows. Section 2 reviews the related work. Section 3 presents the proposed method. The experimental results are given in Section 4. Finally, Section 5 concludes this paper.

Section snippets

Related work

Rectified Linear Unit (ReLU) [26], along with several other key components, has contributed greatly to the recent success of deep neural networks. ReLU has taken the place of saturating activation functions such as sigmoid and tanh, which had been used for a long time. This is mainly attributed to its non-saturating property, which alleviates the so-called vanishing gradient problem and at the same time accelerates convergence. Since the advent of rectifier networks, much effort has been devoted to…

Proposed method

In this section, we first introduce two activation units: ReLU and PReLU, which helps illustrate our method. Next, we describe our Elastic Rectified Linear Unit (EReLU). Then, we present Elastic Parametric Rectified Linear Unit (EPReLU) by fusing EReLU and PReLU.
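For reference, the two units reviewed here take the following standard channel-wise forms, where a_i is a coefficient learned jointly with the network weights:

ReLU:  f(y_i) = max(0, y_i)
PReLU: f(y_i) = y_i if y_i > 0, and f(y_i) = a_i * y_i otherwise.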

Experiments

We evaluate the proposed EReLU and EPReLU on four standard benchmark datasets: CIFAR10 [19], CIFAR100 [19], SVHN [10] and ImageNet 2012 [18]. We compare EReLU and EPReLU with several other activation functions, including APL (Adaptive Piecewise Linear unit) [2], SReLU (S-shaped Rectified Linear Unit) [16], PReLU (Parametric ReLU) [13], ELU (Exponential Linear Unit) [3], and PELU (Parametric Exponential Linear Unit) [40]. We also compare our methods with other networks that have achieved the…

Conclusion

In this paper, we have presented a novel activation unit called the Elastic Rectified Linear Unit (EReLU). During the training stage, EReLU allows the positive part of the input to fluctuate within a moderate range constrained by a uniform distribution. This kind of fluctuation makes the network robust to variation within the same category. Furthermore, we have proposed the Elastic Parametric Rectified Linear Unit (EPReLU) by fusing EReLU and PReLU. EPReLU is able to further improve the performance via a…

References (49)

  • J. Cao et al., Pedestrian detection inspired by appearance constancy and shape symmetry, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
  • B. Chandra et al., Fast learning in deep neural networks, Neurocomputing, 2016.
  • Y. Guo et al., Deep learning for visual understanding: a review, Neurocomputing, 2016.
  • R. Girshick, Fast R-CNN, Proceedings of the IEEE International Conference on Computer Vision, 2015.
  • I.J. Goodfellow et al., Multi-digit number recognition from street view imagery using deep convolutional neural networks, CoRR, 2013.
  • C. Hong et al., Multimodal deep autoencoder for human pose recovery, IEEE Trans. Image Process., 2015.
  • K. He et al., Deep residual learning for image recognition, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
  • K. He et al., Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, Proceedings of the IEEE International Conference on Computer Vision, 2015.
  • S. Ioffe et al., Batch normalization: accelerating deep network training by reducing internal covariate shift, CoRR, 2015.
  • X. Jiang et al., Cascaded subpatch networks for effective CNNs, CoRR, 2016.
  • X. Jin et al., Deep learning with S-shaped rectified linear activation units, CoRR, 2015.
  • X. Jiang et al., Speed up deep neural network based pedestrian detection by sharing features across multi-scale models, Neurocomputing, 2016.
  • A. Krizhevsky et al., ImageNet classification with deep convolutional neural networks, Proceedings of the Advances in Neural Information Processing Systems, 2012.
  • A. Krizhevsky et al., Learning multiple layers of features from tiny images, 2009.

Xiaoheng Jiang received the B.S. degree and M.S. degree in electronic engineering from Tianjin University, China, in 2010 and 2013, respectively. He is currently a Ph.D. candidate at Tianjin University, supervised by Professor Yanwei Pang. His research interests include deep learning, object detection, and image analysis.

Yanwei Pang received the Ph.D. degree in electronic engineering from the University of Science and Technology of China in 2004. He is currently a professor at Tianjin University, China. His research interests include object detection, image processing, and deep learning, in which he has published more than 100 scientific papers, including 30 IEEE Transactions papers. He has been an associate editor and guest editor of Neurocomputing, the International Journal of Image & Graphics, and the International Journal of Computer Mathematics.

    Xuelong Li is a full professor with the Center for OPTical IMagery Analysis and Learning (OPTIMAL), State Key Laboratory of Transient Optics and Photonics, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, Shaanxi, PR China.

Jing Pan received her B.S. degree in Mechanical Engineering from the North China Institute of Technology (now North University of China), Taiyuan, China, in 2002, and her M.S. degree in Precision Instrument and Mechanism from the University of Science and Technology of China, Hefei, China, in 2007. She is currently a Lecturer with the School of Electronic Engineering, Tianjin University of Technology and Education, Tianjin, China. Meanwhile, she is pursuing her Ph.D. degree at Tianjin University, China. Her research interests include computer vision and pattern recognition.

Yinghong Xie received her B.S. degree from Shenyang Jianzhu University, Shenyang, China, in 1999, and her M.S. and Ph.D. degrees from Northeastern University. She was a postdoctoral researcher at Tianjin University. She is currently an associate professor at Shenyang University. Her research interests include image processing, object tracking, and machine learning.
