Abstract
The activation functions used in an artificial neural network define how the nodes of the network respond to input, directly influence the shape of the error surface, and play a role in the difficulty of the neural network training problem. The choice of activation function is therefore a significant question that must be addressed when applying a neural network to a problem. One issue that must be considered when selecting an activation function is activation function saturation. Saturation occurs when a bounded activation function primarily outputs values close to its boundary. Excessive saturation damages the network's ability to encode information and may prevent successful training. Common functions such as the logistic and hyperbolic tangent functions have been shown to saturate when the neural network is trained using particle swarm optimization. This study proposes a new measure of activation function saturation, evaluates the saturation behavior of eight common activation functions, and assesses six mechanisms for controlling activation function saturation in particle swarm optimization based neural network training. Activation functions that result in low levels of saturation are identified. For each activation function, recommendations are made regarding which saturation control mechanism is most effective at reducing saturation.
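To make the notion of saturation concrete, the minimal sketch below computes a simple threshold-based proxy: the fraction of hidden-unit outputs that lie within a small margin of a bounded activation function's limits (here tanh, bounded in [-1, 1]). This is an illustrative example only, not the saturation measure proposed in this study; the helper name saturation_fraction and the margin eps are hypothetical choices for the sketch.

```python
import numpy as np

def saturation_fraction(activations, lo=-1.0, hi=1.0, eps=0.05):
    """Illustrative proxy (not the measure proposed in the paper):
    fraction of outputs within eps of the activation bounds [lo, hi]."""
    a = np.asarray(activations)
    near_bound = (a <= lo + eps) | (a >= hi - eps)
    return near_bound.mean()

# Large weights push tanh pre-activations far from zero, so most
# outputs land near the bounds -1 and +1, i.e. the layer saturates.
rng = np.random.default_rng(0)
inputs = rng.standard_normal((100, 10))
weights = rng.standard_normal((10, 5)) * 10.0  # deliberately large weights
hidden = np.tanh(inputs @ weights)
print(f"saturated fraction: {saturation_fraction(hidden):.2f}")  # ~1.0

# With modest weights the same layer stays largely unsaturated.
hidden_small = np.tanh(inputs @ (weights * 0.01))
print(f"saturated fraction: {saturation_fraction(hidden_small):.2f}")  # ~0.0
```

Since the abstract notes that logistic and hyperbolic tangent networks have been shown to saturate under particle swarm optimization training, a proxy of this kind is one simple way to monitor whether growing weights are driving bounded activations toward their limits.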
Cite this article
Dennis, C., Engelbrecht, A.P. & Ombuki-Berman, B.M. An Analysis of Activation Function Saturation in Particle Swarm Optimization Trained Neural Networks. Neural Process Lett 52, 1123–1153 (2020). https://doi.org/10.1007/s11063-020-10290-z