Abstract
Deep neural networks, which typically use a fixed activation function at each neuron, have achieved breakthrough performance. However, a fixed activation function is not the optimal choice for every data distribution. To address this, this work improves deep neural networks by proposing a novel and efficient activation scheme called "Mutual Activation" (MAC), in which a non-static activation function is learned adaptively during training. Furthermore, the proposed activation neuron, combined with maxout, is a potent higher-order function approximator that overcomes the convexity limitation of maxout's piecewise-linear curves. Experimental results on object recognition benchmarks demonstrate the effectiveness of the proposed activation scheme.
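The abstract's exact MAC formulation is not reproduced here, but the core idea it describes — an activation whose shape is parameterized and learned during training, composed with maxout — can be illustrated with a minimal NumPy sketch. The basis functions, mixing weights, and class name below are illustrative assumptions, not the paper's actual definitions.

```python
import numpy as np

def maxout(z, k):
    """Maxout unit: take the max over k linear pieces.
    z has shape (n, d*k); returns shape (n, d)."""
    n, dk = z.shape
    return z.reshape(n, dk // k, k).max(axis=2)

class AdaptiveActivation:
    """Hypothetical non-static activation: a learnable mixture of
    basis nonlinearities, whose weights would be updated jointly
    with the network's parameters during training."""
    def __init__(self):
        # Trainable mixing weights (here fixed at initialization;
        # in a real network these receive gradients like any weight).
        self.alpha = np.array([0.5, 0.5])

    def forward(self, x):
        # Stack basis nonlinearities and blend them with alpha,
        # so the effective activation shape adapts as alpha changes.
        bases = np.stack([np.maximum(x, 0.0),   # ReLU basis
                          np.tanh(x)])          # tanh basis
        return np.tensordot(self.alpha, bases, axes=1)
```

Because the learned mixture is applied before (or after) the maxout pooling, the composed unit can realize non-convex response curves that a plain maxout unit, being a max of affine functions, cannot.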
Acknowledgments
This work was partially supported by the 973 Program (Project No. 2014CB347600), the National Natural Science Foundation of China (Grant No. 61772275, 61720106004, 61672285 and 61672304) and the Natural Science Foundation of Jiangsu Province (BK20170033).
Cite this article
Zhou, H., Li, Z. Deep networks with non-static activation function. Multimed Tools Appl 78, 197–211 (2019). https://doi.org/10.1007/s11042-018-5702-5