
A novel softplus linear unit for deep convolutional neural networks

Published in Applied Intelligence

Abstract

Current improvements in the performance of deep neural networks are partly due to the introduction of rectified linear units (ReLUs). A ReLU activation function outputs zero for all negative inputs, which causes some neurons to die and introduces a bias shift in the outputs; this bias shift leads to oscillations and impedes learning. Following the principle that zero-mean activations improve learning ability, a softplus linear unit (SLU) is proposed as an adaptive activation function that can speed up learning and improve performance in deep convolutional neural networks. Firstly, to reduce the bias shift, negative inputs are processed with the softplus function, and a general form of the SLU function is proposed. Secondly, the parameters of the positive component are fixed to control vanishing gradients. Thirdly, update rules for the parameters of the negative component are established to meet back-propagation requirements. Finally, we designed deep auto-encoder networks and conducted several experiments with them on the MNIST dataset for unsupervised learning; for supervised learning, we designed deep convolutional neural networks and conducted several experiments with them on the CIFAR-10 dataset. The experiments show faster convergence and better image-classification performance for SLU-based networks than for networks with rectified activation functions.
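The abstract describes the SLU only qualitatively: identity-like behavior with fixed parameters on the positive side, and a softplus-based negative side whose parameters are learned. The sketch below is a minimal illustration of one such piecewise form, assuming f(x) = x for x ≥ 0 and f(x) = β·softplus(x) − γ for x < 0, with γ chosen so the two pieces meet at zero; the exact parameterization and update rules are given in the full paper, so β, γ, and the continuity constraint here are illustrative assumptions, not the authors' definitive formulation.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)),
    # written to avoid overflow for large |x|.
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def slu(x, beta=1.0, gamma=None):
    # Hypothetical SLU form. The positive part is the fixed identity
    # (the abstract says fixing it controls vanishing gradients); the
    # negative part is a shifted softplus, which lets activations take
    # negative values and so reduces the bias shift. By default gamma is
    # set to beta * log(2) so that the two branches agree at x = 0.
    if gamma is None:
        gamma = beta * np.log(2.0)
    return np.where(x >= 0.0, x, beta * softplus(x) - gamma)
```

With these choices the function saturates at −γ for large negative inputs (much like an ELU), so the mean activation is pulled toward zero rather than clamped at it as with a plain ReLU.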

(Figs. 1–12 appear in the full article.)



Acknowledgments

This work was supported by grants from Air Force Engineering University. The authors would like to thank all of the team members of D605 Laboratory.

Author information

Corresponding author

Correspondence to Huizhen Zhao.


About this article


Cite this article

Zhao, H., Liu, F., Li, L. et al. A novel softplus linear unit for deep convolutional neural networks. Appl Intell 48, 1707–1720 (2018). https://doi.org/10.1007/s10489-017-1028-7
